ElasticSearch interview questions

ElasticSearch interview questions

Elasticsearch is a real-time distributed, RESTful search and analytics engine that built on the top of Apache Lucene which is a full-text search engine. you can see Elasticsearch as a distributed storage and that features Real-time Analytics. It is document oriented that stores objects as document and make then indexable so the content of documents is searchable.

Few Known Fact about ElasticSearch

  • Built on Top of Lucene (A full-text search engine by Apache )
  • Document-Oriented (Stores data structured JSON documents)
  • Full-Text Search (Supports Full-text search indexing which giving faster result retrieval)
  • Schema-Free (Uses NoSQL)
  • Restful API (Support Restful APIs for storage and retrieval of records)
  • Supports Autocompletion & Instant Search

Following are the list of Top 25 ElasticSearch Interview questions with their answers

Download ElasticSearch interview questions PDF

ElasticSearch interview questions

There are resource limitations like RAM, vCPU etc., for scale out, due to which applications employ multiple instances of Elasticsearch on separate machines.
Data in an index can be partitioned into multiple portions which are managed by a separate node or instance of Elasticsearch.Each such portion is called a Shard.And an Elasticsearch index has 5 shards by default.
Each shard in elastic search has again two copies of the shard that are called the replicas.
They serve the purpose of fault tolerance and high availability.
By using the command PUT before the index name, creates the index and if you want to add another index then use the command POST before the index name.
Ex: PUT website

An index named computer is created

To delete an index in Elasticsearch use the command DELETE /index name.

Ex: DELETE /website

By using GET / _index name/ indices we can get the list of indices present in the cluster.
Basically, Elasticsearch will automatically create the mapping according to the data provided by the user in the request body. Its bulk functionality can be used to add more than one JSON object in the index.

Ex: POST website /_bulk

To retrieve a document in Elasticsearch, we use the GET verb followed by the _index, _type, _id.
Ex: GET / computer / blog / 123?=pretty
The Boolean model is used by the Lucene to find the similar documents, and a formula called practical scoring function is used to calculate the relevance.
This formula copies concepts from the inverse document/term-document frequency and the vector space model and adds the modern features like coordination factor, field length normalization as well.
Score (q, d) is the relevance score of document “d” for query “q”.
We can perform the following searches in Elasticsearch:
  • Multi-index, Multitype search: All search APIs can be applied across all multiple indices with the support for the multi-index system.
    We can search certain tags across all indices as well as all across all indices and all types.
  • URI search: A search request is executed purely using a URI by providing request parameters.
  • Request body search:A search request can be executed by a search DSL, that includes the query DSL within the body.
The Queries are divided into two types with multiple queries categorized under them.
  • Full-text queries: Match Query, Match phrase Query, Multi match Query, Match phrase prefix Query, common terms Query, Query string Query, simple Query String Query.
  • Term level queries: term Query, term set Query, terms Query, Range Query, Prefix Query, wildcard Query, regexp Query, fuzzy Query, exists Query, type Query, ids Query.
  • Term-based Queries : Queries like the term query or fuzzy query are the low-level queries that do not have analysis phase.A term Query for the term Foo searches for the exact term in the inverted index and calculates the IDF/TF relevance score for every document that has a term.
  • Full-text Queries : Queries like match query or query string queries are the high-level queries that understand that mapping of a field.As soon as the query assembles the complete list of items it executes the appropriate low-level query for every term, and finally combines their results to produce the relevance score of every document.
The aggregation framework provides aggregated data based on search query.It can be seen as a unit of work that builds analytic information over the set of documents.There are different types of aggregations with different purpose and outputs.
Elasticsearch is a distributed documented store with several directories.It can store and retrieve the complex data structures that are serialized as JSON documents in real time.
Yes, Elasticsearch can be used as a replacement for a database as the Elasticsearch is very powerful.
It offers features like multitenancy, sharding and Replication, distribution and cloud Realtime get, Refresh, commit, versioning and re-indexing and many more, which make it an apt replacement of a database.
Generally, Elasticsearch uses the port range of 9200-9300.
So, to check if it is running on your server just type the URL of the homepage followed by the port number.
Ex: mysitename.com:9200