Cassandra Interview Questions

1. What is Cassandra?

Cassandra is an open-source distributed data storage system, originally developed at Facebook for inbox search, and designed to store and manage large amounts of data across commodity servers.

2. What is Cassandra used for, and why use it?

Cassandra was designed to handle big data workloads across multiple nodes without any single point of failure. The main reasons for using Cassandra are:

It is fault-tolerant and offers tunable consistency

It scales from gigabytes to petabytes

It is a wide-column (partitioned row) store

No single point of failure

No need for separate caching layer

Flexible schema design

It has flexible data storage, easy data distribution, and fast writes

It provides durable writes with row-level atomicity and isolation and tunable consistency, rather than full ACID (Atomicity, Consistency, Isolation, and Durability) transactions

Multi-data center and cloud capable

Data compression

3. What is a composite type in Cassandra?

In Cassandra, a composite type lets you define a key or a column name as a concatenation of values of different types. Composite types can be used in two places:

Row Key

Column Name

4. How does Cassandra store data?

All data is stored as bytes. When you specify a validator, Cassandra ensures that those bytes are encoded as required. A comparator then orders the columns according to the ordering specific to that encoding. A composite is simply a byte array with a specific encoding: for each component, it stores a two-byte length, followed by the byte-encoded component, followed by an end-of-component byte.
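
The composite layout described above can be sketched in a few lines of Python. This is only an illustration, not Cassandra's own code: the function names are hypothetical, and the terminator is assumed here to be a single zero byte.

```python
import struct

# Assumption: a single zero byte closes each component.
END_OF_COMPONENT = b"\x00"

def encode_composite(components):
    """Encode byte strings as: two-byte length, component bytes, terminator."""
    out = bytearray()
    for comp in components:
        out += struct.pack(">H", len(comp))  # two-byte big-endian length
        out += comp                          # the byte-encoded component
        out += END_OF_COMPONENT              # terminator after each component
    return bytes(out)

def decode_composite(data):
    """Reverse encode_composite, returning the list of components."""
    components = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        components.append(data[pos:pos + length])
        pos += length + 1  # skip the component and its terminator byte
    return components

encoded = encode_composite([b"user42", b"2024"])
```

Decoding `encoded` with `decode_composite` gives back the two original components in order.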

5. What are the main components of the Cassandra data model?

The main components of the Cassandra data model are:

Cluster

Keyspace

Column

Column Family

6. What is a column family in Cassandra?

A column family in Cassandra is a collection of rows.

7. What is a cluster in Cassandra?

A cluster is a container for keyspaces. A Cassandra database is distributed over several machines that operate together. The cluster is the outermost container: it arranges the nodes in a ring and assigns data to them. Data is replicated across nodes, so a replica can take over if a node fails.
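
As a rough illustration of how a ring can assign data to nodes, the sketch below hashes node names and keys onto the same token space and walks clockwise to pick replicas. The `Ring` class, the use of MD5, and the node names are all assumptions made for the example; Cassandra's real partitioners and replication strategies are more involved.

```python
import bisect
import hashlib

def token(value):
    """Map a string to an integer token (MD5 is an assumption for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    """Hypothetical ring: each node owns the token range ending at its token."""

    def __init__(self, nodes):
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key, replication_factor=2):
        # Find the first node clockwise from the key's token, wrapping around.
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.entries)
        count = min(replication_factor, len(self.entries))
        # The following nodes on the ring hold the additional replicas.
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]

ring = Ring(["node-a", "node-b", "node-c"])
owners = ring.replicas("user42", replication_factor=2)
```

The same key always maps to the same nodes, and the replicas are distinct, so losing one node still leaves a copy available.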

8. List the other components of Cassandra.

The other components of Cassandra are:

Node

Data Center

Cluster

Commit log

Mem-table

SSTable

Bloom Filter

9. What is a keyspace in Cassandra?

In Cassandra, a keyspace is a namespace that determines how data is replicated across nodes. A cluster typically contains one keyspace per application.

10. What is the syntax to create a keyspace in Cassandra?

The syntax for creating a keyspace in Cassandra is:

CREATE KEYSPACE <identifier> WITH <properties>
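
To make the `<identifier>` and `<properties>` placeholders concrete, here is a small Python helper that builds one possible statement. The helper name, the keyspace name `demo_ks`, and the choice of SimpleStrategy with a replication factor of 3 are assumptions for illustration; the helper only builds a string and does not talk to a cluster.

```python
def create_keyspace_cql(name, replication_factor=3, durable_writes=True):
    """Build a CREATE KEYSPACE statement using SimpleStrategy replication."""
    # Rough guard only: CQL identifier rules are stricter than Python's.
    if not name.isidentifier():
        raise ValueError("keyspace name must be a simple identifier")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {{'class': 'SimpleStrategy', "
        f"'replication_factor': {int(replication_factor)}}} "
        f"AND durable_writes = {str(bool(durable_writes)).lower()};"
    )

statement = create_keyspace_cql("demo_ks", replication_factor=3)
```

The resulting string can be pasted into cqlsh or passed to a driver's execute call.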

11. What values are stored in a Cassandra column?

A Cassandra column holds three values:

Column Name

Value

Timestamp

12. When can you use ALTER KEYSPACE?

ALTER KEYSPACE can be used to change keyspace properties such as the number of replicas and the durable_writes setting.

13. What is Cassandra cqlsh?

cqlsh is a command-line shell that lets users communicate with the Cassandra database through CQL (Cassandra Query Language). Using cqlsh, you can:

Define a schema

Insert data

Execute a query

14. What do the shell commands CAPTURE and CONSISTENCY do?

cqlsh provides various shell commands. CAPTURE captures the output of a command and appends it to a file, while CONSISTENCY displays the current consistency level or sets a new one.

15. What is mandatory when creating a table in Cassandra?

A primary key is mandatory when creating a table. It is made up of one or more columns of the table.

16. What needs to be taken care of when adding a column?

When adding a column, make sure that:

The column name does not conflict with an existing column name

The table is not defined with the compact storage option

17. What are Cassandra CQL collections?

Cassandra CQL collections let you store multiple values in a single column. CQL provides the following collection types:

List: Used when the order of the elements needs to be maintained and a value may be stored multiple times (can hold duplicate elements)

Set: Used to store a group of elements that are returned in sorted order (holds unique elements)

Map: A data type used to store key-value pairs
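
The difference between the three collection types can be shown with their closest Python analogues. This is only an analogy, with made-up sample values; it is not how a driver represents CQL collections.

```python
# List analogue: insertion order is kept and duplicates are allowed.
emails = ["a@example.com", "b@example.com", "a@example.com"]

# Set analogue: duplicates collapse, and the elements come back sorted.
tags = sorted({"nosql", "database", "nosql"})

# Map analogue: each key is associated with one value.
phones = {"home": "555-0100", "work": "555-0101"}
```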

18. How does Cassandra write data?

A Cassandra write passes through three components:

Commit log write

Memtable write

SSTable write

Cassandra first appends the data to the commit log, then writes it to an in-memory table structure called the memtable, and finally flushes it to an SSTable on disk.
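
The three-step write path can be modelled with a toy class. Everything here is a simplification under stated assumptions: the `Table` class is hypothetical, the flush is triggered by a tiny entry count rather than by memory size, and the commit log is cleared on flush in one step.

```python
class Table:
    """Toy model of the write path: commit log, then memtable, then SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory structure, latest value per key
        self.sstables = []     # immutable, key-sorted snapshots
        self.memtable_limit = memtable_limit  # assumed flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. commit log first
        self.memtable[key] = value            # 2. then the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. finally an SSTable

    def flush(self):
        # Freeze the memtable as a sorted, immutable tuple of (key, value) pairs.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            for k, v in sstable:
                if k == key:
                    return v
        return None

table = Table(memtable_limit=2)
table.write("b", 1)
table.write("a", 2)  # reaches the limit, so the memtable is flushed
table.write("c", 3)  # stays in the commit log and memtable for now
```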

19. What is a Memtable in Cassandra?

Cassandra writes data to an in-memory structure known as the Memtable

It is an in-memory cache with content stored as key/column pairs

Memtable data is sorted by key

There is a separate Memtable for each column family, and column data is retrieved from it by key

20. What does an SSTable consist of?

An SSTable consists mainly of two files:

Index file (Bloom filter and key-offset pairs)

Data file (actual column data)

21. What is a Bloom filter used for in Cassandra?

A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It can report false positives but never false negatives. In Cassandra, it is used to determine whether an SSTable may hold data for a particular row, which saves I/O when performing a key lookup.
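
A minimal Bloom filter can be sketched as follows. The class name, the bit-array size, the number of hashes, and the use of SHA-256 with a seed prefix are all illustrative assumptions and do not reflect Cassandra's actual implementation.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))

sstable_filter = BloomFilter()
for row_key in ("user1", "user2", "user3"):
    sstable_filter.add(row_key)
```

A read would consult `might_contain` first and skip the SSTable whenever it returns False, avoiding a disk lookup.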

22. How does Cassandra write changed data to the commit log?

Cassandra appends changed data to the commit log

The commit log acts as a crash-recovery log for data

A write operation is not considered successful until the changed data has been appended to the commit log

Data will not be lost once the commit log has been flushed to disk

23. How does Cassandra delete data?

SSTables are immutable, so a row cannot be removed from an SSTable in place. When a row needs to be deleted, Cassandra writes a special marker called a tombstone in place of the column value. When the data is read, values marked with a tombstone are treated as deleted.
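
The effect of a tombstone on reads can be sketched like this. The `TOMBSTONE` sentinel, the `read_latest` function, and the dictionary-based SSTables are hypothetical stand-ins for the example.

```python
TOMBSTONE = object()  # sentinel standing in for Cassandra's deletion marker

def read_latest(sstables, key):
    """Return the newest value for key, or None if it is missing or deleted.

    Each SSTable maps key -> (timestamp, value); the highest timestamp wins.
    """
    newest = None
    for sstable in sstables:
        cell = sstable.get(key)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    if newest is None or newest[1] is TOMBSTONE:
        return None  # a tombstone hides any older value
    return newest[1]

older = {"user1": (100, "alice"), "user2": (100, "bob")}
newer = {"user1": (200, TOMBSTONE)}  # the delete is written as a new record
```

The older SSTable is never modified: the delete only adds a newer record that shadows the old value on read.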

24. What is cqlsh, and why is it used?

cqlsh is a command-line shell that lets users communicate with the Cassandra database through CQL (Cassandra Query Language). Using cqlsh, you can:

Define a schema

Insert data

Execute a query

25. What is a YAML file in Cassandra?

The cassandra.yaml file is the main configuration file for Cassandra. After changing properties in the cassandra.yaml file, you must restart the node for the changes to take effect.

26. What are durable writes?

Durable writes is a keyspace option that tells Cassandra whether to use the commit log for updates on the current keyspace.

This option is not mandatory. The default value for durable writes is true.

27. Differentiate between Static and Dynamic CQL Tables.

A Static Table uses a relatively static set of column names and is similar to Relational Database Table.

A dynamic table allows you to pre-compute result sets and stores them in a single row for efficient data retrieval.

28. Differentiate between DROP and TRUNCATE in cqlsh.

The DROP TABLE command removes the specified table, including all of its data, from the keyspace.

The TRUNCATE command permanently deletes all rows from a table but keeps the table itself.

29. What is the gossip protocol?

The gossip protocol in Cassandra is a peer-to-peer communication protocol in which nodes choose which peers to exchange state information with. The nodes exchange information about themselves and about the other nodes they have gossiped about, so all nodes quickly learn about all other nodes in the cluster.
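
How state spreads through gossip can be illustrated with a toy model. The `Node` class and its integer heartbeat versions are assumptions for the sketch; real gossip messages carry richer state and peers are chosen at random on a timer.

```python
class Node:
    """Toy gossip participant holding a name -> heartbeat version map."""

    def __init__(self, name):
        self.name = name
        self.state = {name: 1}  # a node starts out knowing only itself

    def beat(self):
        self.state[self.name] += 1  # bump this node's own heartbeat version

    def gossip_with(self, peer):
        """Exchange state both ways, keeping the higher version per node."""
        merged = dict(self.state)
        for name, version in peer.state.items():
            if version > merged.get(name, 0):
                merged[name] = version
        self.state = dict(merged)
        peer.state = dict(merged)

a, b, c = Node("a"), Node("b"), Node("c")
a.gossip_with(b)  # a and b learn about each other
b.gossip_with(c)  # c learns about a through b, without contacting a
```

After two exchanges, node c knows about node a even though the two never communicated directly, which is why membership information spreads quickly.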
