mirror of
https://github.com/kamranahmedse/developer-roadmap.git
synced 2026-03-12 17:51:53 +08:00
chore: sync content to repo (#9409)
Co-authored-by: kamranahmedse <4921183+kamranahmedse@users.noreply.github.com>
committed by
GitHub
parent
df5cdf244f
commit
b544d56e09
@@ -0,0 +1,3 @@
|
||||
# API Keys
|
||||
|
||||
API keys in Elasticsearch provide a mechanism for authentication and authorization, allowing users or applications to securely access Elasticsearch APIs. They are a more granular alternative to using usernames and passwords, enabling you to restrict access to specific resources and actions. API keys can be configured with specific roles and privileges, limiting what a user or application can do within the Elasticsearch cluster.
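As a minimal sketch of how a client might present an API key: Elasticsearch accepts API keys in the `Authorization` header under the `ApiKey` scheme, with the key ID and secret joined by a colon and Base64-encoded. The function name and the sample ID and secret below are hypothetical.

```python
import base64


def api_key_header(key_id: str, api_key: str) -> dict:
    """Build the Authorization header for an Elasticsearch API key."""
    # Credentials are "<id>:<api_key>", UTF-8 encoded, then Base64 encoded.
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


# Hypothetical values, for illustration only.
headers = api_key_header("my-key-id", "my-key-secret")
print(headers)
```

The resulting dictionary can be passed as request headers by whichever HTTP client you use.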
|
||||
@@ -0,0 +1,3 @@
|
||||
# Authentication
|
||||
|
||||
Authentication is the process of verifying the identity of a user or system attempting to access a resource. The user or system proves who they are, typically through credentials like usernames and passwords, API keys, or certificates. Only after the identity is confirmed does authorization, a separate step, decide what that user or system may access.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Autoscaling
|
||||
|
||||
Autoscaling is the ability of a system to automatically adjust its resources (like compute, memory, or storage) based on the current demand. This means that the system can scale up (add more resources) when demand increases and scale down (remove resources) when demand decreases, all without manual intervention. This ensures optimal performance and cost efficiency by only using the resources that are actually needed.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Avg, Sum, Min, and Max Aggregations
|
||||
|
||||
These aggregations are fundamental tools for calculating statistical summaries of numerical data. They compute the average (Avg), total (Sum), smallest value (Min), and largest value (Max) respectively, across a set of documents that match a query. These aggregations operate on numeric fields within your Elasticsearch indices, providing insights into the distribution and range of your data.
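To make this concrete, here is a small sketch that builds a search request body asking for all four metrics on one field. The helper name and the `price` field are assumptions for illustration; only the `avg`, `sum`, `min`, and `max` aggregation types come from Elasticsearch itself.

```python
import json


def metric_aggs_body(field: str) -> dict:
    """Build a search body that computes avg, sum, min, and max for one numeric field."""
    return {
        "size": 0,  # return only aggregation results, no hits
        "aggs": {
            f"{kind}_{field}": {kind: {"field": field}}
            for kind in ("avg", "sum", "min", "max")
        },
    }


body = metric_aggs_body("price")  # "price" is a hypothetical field name
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of an index; adding a `query` section restricts the aggregations to matching documents.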
|
||||
@@ -0,0 +1,3 @@
|
||||
# BM25 Algorithm
|
||||
|
||||
BM25 (Best Matching 25) is a ranking function used by search engines to estimate the relevance of documents to a given search query. It's a bag-of-words retrieval function that scores documents based on the query terms appearing in each document, taking into account term frequency and document length. The algorithm adjusts for document length, preventing longer documents from being unfairly favored, and also considers how frequently a term appears in the entire collection of documents.
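The scoring idea can be sketched with the classic BM25 formula for a single query term. This is an illustration rather than the exact Lucene implementation, which may differ in constant factors; `k1=1.2` and `b=0.75` are the Elasticsearch defaults, and the function and argument names are made up for this example.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term against one document with the classic BM25 formula."""
    # Rare terms (low doc_freq) get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: documents longer than average are penalized.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less than the last.
    return idf * tf * (k1 + 1) / (tf + norm)


# Same term frequency, different document lengths (hypothetical numbers).
short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=2, doc_len=400, avg_doc_len=100, n_docs=1000, doc_freq=10)
print(short_doc, long_doc)
```

With these inputs the shorter document scores higher, which is the length adjustment described above. A full query score is the sum of the per-term scores.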
|
||||
@@ -0,0 +1,7 @@
|
||||
# Boolean Data Type
|
||||
|
||||
A boolean data type represents a logical value, which can be either true or false. It's used to store binary information, indicating whether a condition is met or not, or representing a simple yes/no state. This data type is fundamental for filtering, decision-making, and representing flags within a dataset.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Boolean field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/boolean)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Boosting Queries
|
||||
|
||||
Boosting queries in Elasticsearch allows you to influence the relevance score of documents based on specific criteria. It works by increasing or decreasing the score of documents that match certain query clauses, effectively prioritizing some results over others. This helps to fine-tune search results to better align with user intent and improve the overall precision of your search.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Bulk Indexing
|
||||
|
||||
Bulk indexing in Elasticsearch is a way to send multiple indexing, updating, or deleting operations to the Elasticsearch cluster in a single request. Instead of sending each document individually, you batch them together, which significantly reduces the overhead of network communication and processing, leading to faster indexing speeds. This approach is particularly useful when dealing with large datasets or when needing to ingest data quickly.
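The `_bulk` endpoint expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The sketch below only builds that body; the helper name, index name, and documents are hypothetical.

```python
import json


def build_bulk_body(index: str, docs: dict) -> str:
    """Serialize {doc_id: source} pairs into the NDJSON body of a _bulk request."""
    lines = []
    for doc_id, source in docs.items():
        # Each operation is an action line followed by the document source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("products", {"1": {"name": "pen"}, "2": {"name": "ink"}})
print(bulk_body)
```

The body would be sent with `POST /_bulk` and the `Content-Type: application/x-ndjson` header.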
|
||||
@@ -0,0 +1,3 @@
|
||||
# Cardinality Aggregation
|
||||
|
||||
Cardinality aggregation is used to estimate the number of unique values in a field. It's particularly useful when you need to count distinct items but don't need the actual unique values themselves. This aggregation provides an approximate count, balancing accuracy with performance, especially when dealing with large datasets.
|
||||
@@ -0,0 +1,3 @@
|
||||
# CAT API
|
||||
|
||||
The CAT (compact and aligned text) APIs in Elasticsearch provide a simple, human-readable way to access cluster-level information over the REST API, typically from the command line or the Kibana Console. They return data in a tabular format rather than JSON, making it easy to read the status, health, and performance metrics of your Elasticsearch cluster. These APIs are intended for people doing monitoring and troubleshooting; applications should use the corresponding JSON APIs instead.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Cluster Monitoring
|
||||
|
||||
Cluster monitoring involves continuously observing the health, performance, and resource utilization of an Elasticsearch cluster. This process helps identify potential issues, bottlenecks, and anomalies that could impact the cluster's stability and responsiveness. Effective monitoring allows administrators to proactively address problems, optimize resource allocation, and ensure the cluster operates efficiently.
|
||||
@@ -0,0 +1,11 @@
|
||||
# Cluster (System)
|
||||
|
||||
A cluster is a collection of one or more Elasticsearch nodes that work together to store and process data. It provides a distributed and scalable system where data is divided into shards and distributed across multiple nodes for redundancy and performance. The cluster manages indexing, searching, and analysis operations across all nodes, presenting a unified view of the data to the user.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Deploy an Elasticsearch cluster](http://elastic.co/docs/deploy-manage/deploy/self-managed/installing-elasticsearch)
|
||||
- [@official@Clusters, nodes, and shards](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards)
|
||||
- [@article@How to setup and install Elasticsearch: From a single node to a cluster of nodes](http://severalnines.com/blog/how-to-setup-and-install-elasticsearch-cluster/)
|
||||
- [@article@Mastering the Art of Elasticsearch Cluster Setup](https://opster.com/guides/elasticsearch/operations/elasticsearch-cluster-setup/)
|
||||
- [@video@Elasticsearch basic concepts | cluster, shards, nodes | Elasticsearch tutorial for beginners](https://www.youtube.com/watch?v=GH6hO2L4LR0)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Controlling Search Results
|
||||
|
||||
Controlling search results involves influencing the order and relevance of documents returned by a search query. This includes techniques to boost the score of certain documents, filter out unwanted results, and tailor the search experience to meet specific user needs. It allows for fine-tuning the search process beyond basic keyword matching.
|
||||
@@ -0,0 +1,8 @@
|
||||
# Coordinating Nodes
|
||||
|
||||
Coordinating nodes in Elasticsearch are like traffic controllers. They receive client requests, route them to the data nodes that hold the relevant shards, and then merge the results before sending them back to the client. Every node can act as a coordinating node; a coordinating-only node has no other roles, so it holds no data itself, but it plays a crucial role in distributing the workload and ensuring efficient query execution across the cluster.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Coordinating node](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles#coordinating-node)
|
||||
- [@official@Coordinating only node](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles#coordinating-only-node-role)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Create Index
|
||||
|
||||
Creating an index in Elasticsearch using the Create index API involves sending a PUT request to the Elasticsearch server. This request specifies the name of the index you want to create. You can also include settings and mappings in the request body to configure how the index should store and analyze your data. If the index doesn't already exist, Elasticsearch will create it based on the provided configuration. If the index already exists, the request fails with an error.
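The request described above could be assembled like this. It is a sketch that only builds the method, path, and body without sending anything; the index name, field names, and shard counts are illustrative assumptions.

```python
import json


def create_index_request(name: str) -> tuple:
    """Return the HTTP method, path, and JSON body for creating an index."""
    body = {
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},
        "mappings": {
            "properties": {
                "title": {"type": "text"},        # hypothetical field
                "created_at": {"type": "date"},   # hypothetical field
            }
        },
    }
    return "PUT", f"/{name}", json.dumps(body)


method, path, payload = create_index_request("articles")
print(method, path)
print(payload)
```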
|
||||
@@ -0,0 +1,3 @@
|
||||
# Cross-Cluster Replication
|
||||
|
||||
Cross-cluster replication (CCR) allows you to replicate indices and their data from one Elasticsearch cluster to another. This enables scenarios like disaster recovery, where a secondary cluster can take over if the primary fails, and data locality, where data is replicated closer to users in different geographic regions for faster access. CCR replicates asynchronously from a leader index to a read-only follower index, so a follower may briefly lag behind its leader while still providing a reliable and efficient way to maintain data availability and resilience.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Custom Analyzers
|
||||
|
||||
Custom analyzers in Elasticsearch provide a way to define how text is processed both when indexing documents and when searching. They allow you to combine character filters, tokenizers, and token filters in a specific order to tailor the analysis process to your specific needs, such as handling language-specific nuances or removing unwanted characters. This customization ensures that your search results are more relevant and accurate.
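The ordering of the three stages can be illustrated with a toy pipeline in plain Python. This is not how Elasticsearch implements analysis; each function below is a simplified stand-in for a character filter, a tokenizer, or a token filter, and the stop word list is made up.

```python
import re


def strip_html(text: str) -> str:
    """Character filter: remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text: str) -> list:
    """Tokenizer: split the text into word tokens."""
    return re.findall(r"\w+", text)


def lowercase(tokens: list) -> list:
    """Token filter: normalize case."""
    return [t.lower() for t in tokens]


def remove_stopwords(tokens: list, stopwords=frozenset({"the", "a", "an"})) -> list:
    """Token filter: drop common words."""
    return [t for t in tokens if t not in stopwords]


def analyze(text: str) -> list:
    # Order matters: character filters, then the tokenizer, then token filters.
    return remove_stopwords(lowercase(tokenize(strip_html(text))))


tokens = analyze("<p>The Quick Brown Fox</p>")
print(tokens)
```

Swapping the two token filters would let "The" survive, because the stop word check runs before lowercasing; that is why the order of filters is part of the analyzer definition.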
|
||||
@@ -0,0 +1,7 @@
|
||||
# Data Nodes
|
||||
|
||||
Data nodes in Elasticsearch are the workhorses of the cluster, responsible for storing data and performing CPU and I/O intensive operations like searching, indexing, and data analysis. These nodes hold shards of Elasticsearch indices and manage the actual data storage on disk. They contribute significantly to the cluster's overall performance and scalability.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Data nodes](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles#data-node-role)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Data Tiers
|
||||
|
||||
Data tiers in Elasticsearch refer to the strategy of categorizing and storing data based on its access frequency and importance. This approach involves segregating data into different storage types (like hot, warm, cold, and frozen) to optimize performance, cost, and resource utilization. By aligning data storage with its usage patterns, organizations can efficiently manage large volumes of data while maintaining acceptable query speeds and minimizing infrastructure expenses.
|
||||
@@ -0,0 +1,8 @@
|
||||
# Data Types
|
||||
|
||||
Data types define the kind of values that can be stored in a field. They specify how Elasticsearch should interpret and store the data, influencing how it can be searched and analyzed. Common examples include text, numbers, dates, booleans, and geo-locations, each optimized for different use cases.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Field data types](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/field-data-types)
|
||||
- [@article@Elasticsearch For Dummies Part 2: Datatypes](https://tim-estes.medium.com/elasticsearch-for-dummies-part-2-datatypes-c7a9494b48e8)
|
||||
@@ -0,0 +1,8 @@
|
||||
# Dates
|
||||
|
||||
Dates in Elasticsearch represent points in time. They are stored internally as the number of milliseconds since the Unix epoch (January 1, 1970, 00:00:00 UTC). Elasticsearch provides flexibility in how you format date values when indexing documents, allowing you to use strings in various formats or numeric values representing milliseconds since the epoch. When querying, you can use date ranges and other date-specific operations to filter and analyze your data based on time.
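The internal representation can be reproduced in a few lines. This sketch converts an ISO 8601 string to milliseconds since the epoch; treating values without an offset as UTC is an assumption of this example, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_string: str) -> int:
    """Convert an ISO 8601 timestamp to milliseconds since the Unix epoch (UTC)."""
    dt = datetime.fromisoformat(iso_string)
    if dt.tzinfo is None:
        # Assumption for this sketch: treat timestamps without an offset as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division by one millisecond avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


millis = to_epoch_millis("2015-01-01T12:10:30+00:00")
print(millis)
```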
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Date field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/date)
|
||||
- [@official@Date nanoseconds field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/date_nanos)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Delete by Query
|
||||
|
||||
Delete by Query allows you to remove documents from an Elasticsearch index that match a specific query. Instead of deleting documents individually by their ID, you can define criteria based on field values or other search parameters. This is useful for removing outdated, irrelevant, or incorrect data from your index in bulk.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Delete Documents
|
||||
|
||||
Deleting a document in Elasticsearch involves sending a DELETE request that specifies the index name and the unique ID of the document you wish to remove. The document is first marked as deleted and stops appearing in search results once the index is refreshed; the disk space it occupies is reclaimed later, when segments are merged.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Delete Index
|
||||
|
||||
Deleting an index in Elasticsearch removes the entire index and all its associated data. You can achieve this using the Delete Index API. Simply send a DELETE request to the index's name endpoint (e.g., `DELETE /your_index_name`). This action is permanent and irreversible, so it's crucial to ensure you're deleting the correct index and have a backup if needed.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Doc Values
|
||||
|
||||
Doc values are a data structure in Elasticsearch that stores field values in a column-oriented fashion, optimized for aggregations, sorting, and scripting. Instead of storing the data alongside the inverted index, doc values are stored separately on disk, making them efficient for retrieving values for a large number of documents. This allows Elasticsearch to perform operations like sorting and aggregations much faster than if it had to retrieve the data from the inverted index.
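The difference between the two layouts can be shown with a toy example. This is a conceptual sketch in plain Python, not the on-disk format Elasticsearch uses; the field values are hypothetical and each document's ID is simply its position in the list.

```python
from collections import Counter, defaultdict

# Hypothetical values of a "color" field; the doc ID is the list position.
values = ["red", "blue", "red", "green"]

# Inverted index: term -> doc IDs. Answers "which documents contain 'red'?"
inverted_index = defaultdict(list)
for doc_id, value in enumerate(values):
    inverted_index[value].append(doc_id)

# Column-oriented layout (the idea behind doc values): doc ID -> value.
# Answers "what is the value for this document?" in one lookup.
doc_values = values

# Sorting and aggregating read the column document by document.
sorted_ids = sorted(range(len(doc_values)), key=doc_values.__getitem__)
counts = Counter(doc_values)

print(inverted_index["red"], sorted_ids, counts["red"])
```

Answering the sort from the inverted index alone would mean scanning every term to find out which one belongs to each document, which is the cost that the column layout avoids.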
|
||||
@@ -0,0 +1,10 @@
|
||||
# Document (Row)
|
||||
|
||||
A document is a basic unit of information in Elasticsearch, analogous to a row in a relational database table. It's a JSON object containing a set of fields, each with a name and one or more values. These fields can hold various data types like text, numbers, dates, booleans, and even nested objects or arrays.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Index basics](https://www.elastic.co/docs/manage-data/data-store/index-basics)
|
||||
- [@article@Elasticsearch Document](https://www.dremio.com/wiki/elasticsearch-document/)
|
||||
- [@article@Elasticsearch Document](https://opster.com/guides/elasticsearch/glossary/elasticsearch-document/)
|
||||
- [@video@How Elasticsearch Works: Documents, JSON & Index Explained](https://www.youtube.com/watch?v=wHZ3JsRzukI)
|
||||
@@ -0,0 +1,10 @@
|
||||
# Dynamic Mappings
|
||||
|
||||
Dynamic mapping in Elasticsearch allows the index to automatically detect and add new fields to the mapping when new documents containing previously unseen fields are indexed. This means you don't have to predefine the schema for every field in your data; Elasticsearch infers the data type and adds the field to the index mapping on the fly. This is useful for quickly indexing data without upfront schema design.
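The inference step can be approximated with a toy function. It loosely mirrors the default rules (JSON booleans become `boolean`, integers `long`, floating-point numbers `float`, and strings either `date` or `text`), but the date check here is a crude stand-in for Elasticsearch's date detection, and the function name and sample document are hypothetical.

```python
from datetime import datetime


def infer_field_type(value) -> str:
    """Guess a field type from a JSON value, loosely mimicking dynamic mapping."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)  # crude stand-in for date detection
            return "date"
        except ValueError:
            return "text"  # Elasticsearch also adds a keyword sub-field
    return "unknown"


doc = {"title": "Hello", "views": 42, "rating": 4.5, "draft": False, "published": "2024-05-01"}
mapping = {field: infer_field_type(value) for field, value in doc.items()}
print(mapping)
```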
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Dynamic mapping](https://www.elastic.co/docs/manage-data/data-store/mapping/dynamic-mapping)
|
||||
- [@official@Dynamic field mapping](https://www.elastic.co/docs/manage-data/data-store/mapping/dynamic-field-mapping)
|
||||
- [@article@Elasticsearch Dynamic Mapping: Advanced Insights and Best Practices](https://opster.com/guides/elasticsearch/data-architecture/elasticsearch-dynamic-mapping/)
|
||||
- [@video@Dynamic index mappings in Elasticsearch and OpenSearch](https://www.youtube.com/watch?v=KBMTES9lMOM)
|
||||
@@ -0,0 +1,9 @@
|
||||
# Elastic Cloud
|
||||
|
||||
Elastic Cloud is a suite of Elasticsearch-based services offered by Elastic, the company behind Elasticsearch. It provides a managed platform for deploying, managing, and scaling Elasticsearch clusters in the cloud, eliminating the need for users to handle the underlying infrastructure. This includes tasks like provisioning servers, configuring networking, and managing backups, allowing users to focus on analyzing and visualizing their data.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Accelerate results in Elastic Cloud](https://www.elastic.co/cloud)
|
||||
- [@official@Elastic Cloud Serverless](https://www.elastic.co/cloud/serverless)
|
||||
- [@video@Getting Started with Elasticsearch Service and Elastic Cloud](https://www.youtube.com/watch?v=mIHYcxe70fc)
|
||||
@@ -0,0 +1,14 @@
|
||||
# Elasticsearch Use Cases
|
||||
|
||||
Elastic use cases can be classified into three main categories:
|
||||
* **Elasticsearch** is a distributed, open-source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.
|
||||
* **Elastic Observability** builds on this foundation to provide a unified view of logs, metrics, and traces, enabling users to monitor and troubleshoot their systems.
|
||||
* **Elastic Security** leverages Elasticsearch's search and analytics capabilities to offer threat detection, prevention, and response, helping organizations protect themselves from cyber threats.

Elasticsearch use cases are diverse, ranging from application search and website search to logging and log analytics, security analytics, and business analytics.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Elasticsearch solution overview](https://www.elastic.co/docs/solutions/search)
|
||||
- [@official@Elastic Observability overview](https://www.elastic.co/docs/solutions/observability)
|
||||
- [@official@Elastic Security overview](https://www.elastic.co/docs/solutions/security)
|
||||
- [@video@Getting Started with Elastic Observability](https://www.youtube.com/watch?v=SWUgqOSAyqU)
|
||||
- [@video@Elastic Security Solutions Overview](https://www.youtube.com/watch?v=wzPMtmINEhU)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Event Query Language (EQL)
|
||||
|
||||
Event Query Language (EQL) is a powerful query language designed for security event analysis and threat hunting. It allows users to search for sequences of events that match specific patterns, enabling the detection of complex attack behaviors. EQL focuses on identifying relationships and dependencies between events over time, making it well-suited for uncovering malicious activities within large datasets.
|
||||
@@ -0,0 +1,3 @@
|
||||
# ES|QL
|
||||
|
||||
ES|QL (Elasticsearch Query Language) is a piped query language for Elasticsearch that allows users to filter, transform, and analyze data. A query is written as a sequence of commands chained with the pipe character (`|`), each one operating on the output of the previous one. It provides a more accessible way to interact with Elasticsearch data compared to the traditional JSON-based query DSL, enabling users to perform complex data manipulations and aggregations with relative ease.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Exists Query
|
||||
|
||||
An exists query in Elasticsearch finds documents that contain an indexed value for a specific field, regardless of what that value is. A field counts as missing if, for example, it is absent from the document or its value is `null` or an empty array. This is useful when you need to filter documents based on whether a particular field has been populated or not.
|
||||
@@ -0,0 +1,8 @@
|
||||
# Explicit Mappings
|
||||
|
||||
Explicit mappings in Elasticsearch involve defining the structure and data types of fields within an index before indexing any documents. This allows you to have precise control over how Elasticsearch analyzes and stores your data, ensuring that fields are treated as intended (e.g., a field containing dates is treated as a date, not just text). By explicitly defining mappings, you can optimize search performance and data integrity.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Explicit mapping](https://www.elastic.co/docs/manage-data/data-store/mapping/explicit-mapping)
|
||||
- [@video@Explicit index mappings in Elasticsearch and OpenSearch](https://www.youtube.com/watch?v=KRd4Ud-5_wM)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Fielddata
|
||||
|
||||
Fielddata is an in-memory data structure used by Elasticsearch to enable aggregations, sorting, and scripting on text fields. Because text fields are analyzed (broken down into individual terms), Elasticsearch needs a way to quickly access all the terms for a specific document when performing these operations. Fielddata loads all the terms for a field into heap memory at query time, allowing for fast access during these operations. Because this can consume a lot of memory, fielddata is disabled on text fields by default.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Filter Aggregations
|
||||
|
||||
Filter aggregations narrow down the documents that are used to calculate metrics within an aggregation. They work by applying a filter to the documents before the aggregation is performed, effectively creating a subset of the data for analysis. This allows you to focus on specific segments of your data and gain insights into particular subsets of your documents.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Filter
|
||||
|
||||
A filter in Elasticsearch is a query that returns documents matching specific criteria in a boolean (yes/no) manner. Unlike regular queries that calculate a relevance score, filters simply determine whether a document matches the condition or not. They are often used to narrow down the search results based on specific attributes or ranges, such as price, date, or category.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Bool Query Filter Context
|
||||
|
||||
The `filter` context within a Bool query in Elasticsearch is used to narrow down the documents that match a query without affecting the relevance score. It's like a pre-filter that efficiently excludes documents that don't meet specific criteria before the scoring process even begins, making it ideal for exact matches, range queries, and other conditions where relevance isn't a factor.
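A request body that mixes both contexts might look like the following sketch. The field names (`title`, `status`, `published_at`) and values are hypothetical; the `bool`, `must`, `filter`, `match`, `term`, and `range` keys are standard query DSL.

```python
import json

# Hypothetical index fields: "title", "status", "published_at".
search_body = {
    "query": {
        "bool": {
            # Query context: contributes to the relevance score.
            "must": [{"match": {"title": "search engine"}}],
            # Filter context: yes/no matching only, no effect on the score.
            "filter": [
                {"term": {"status": "published"}},
                {"range": {"published_at": {"gte": "2024-01-01"}}},
            ],
        }
    }
}

print(json.dumps(search_body, indent=2))
```

Only the `match` clause influences ranking here; the `term` and `range` clauses just decide which documents are eligible.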
|
||||
@@ -0,0 +1,9 @@
|
||||
# Flattened Data Type
|
||||
|
||||
The flattened data type in Elasticsearch allows you to index an entire JSON object as a single field. This is useful when you have objects with many or unpredictable keys and want to avoid creating a separate mapping for each one. Instead of mapping each individual field, the flattened type parses out the leaf values of the object and indexes them into one field as keywords. You can still query a specific key using dot notation, but only with keyword-style queries such as `term`, `prefix`, and `range`; full-text analysis is not applied.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Flattened field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/flattened)
|
||||
- [@article@Flattened Datatype Mappings — Elasticsearch Tutorial](https://alirezadp10.medium.com/flattened-datatype-mappings-elasticsearch-tutorial-1cf77497e706)
|
||||
- [@video@Flattened Datatype](https://www.youtube.com/watch?v=UhPaEMR4pJ4)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Function Score Query
|
||||
|
||||
The Function Score Query allows you to modify the score of documents retrieved by a query. It provides a way to apply a function to each document that matches the base query, influencing its final relevance score. This function can be based on factors like document fields, pre-defined weights, or even custom scripts, enabling fine-grained control over search results ranking.
|
||||
@@ -0,0 +1,7 @@
|
||||
# Geo Points
|
||||
|
||||
Geo points are a specific data type in Elasticsearch used to store and index latitude and longitude coordinates. They allow you to represent locations on Earth and perform geospatial queries, such as finding points within a certain distance of a location or identifying points within a defined area. These coordinates are typically stored as a pair of numbers, with latitude representing the north-south position and longitude representing the east-west position.
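The "within a certain distance" check can be sketched with the haversine formula. This illustrates the idea only and is not necessarily the calculation Elasticsearch performs internally; the function names, the Earth radius constant, and the city coordinates are assumptions of this example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def within_distance(point, center, max_km):
    """Return True if `point` lies within `max_km` of `center`."""
    return haversine_km(*point, *center) <= max_km


paris = (48.8566, 2.3522)     # (latitude, longitude)
london = (51.5074, -0.1278)
print(within_distance(london, paris, 500))
```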
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Geopoint field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/geo-point)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Get Document
|
||||
|
||||
To retrieve a specific document from an Elasticsearch index, you need to know its unique identifier. You can then use the Get API, providing the index name and the document ID. Elasticsearch looks up the document with that ID directly in the specified index, without running a search, and returns it. The response includes the document's source data (the fields and their values), along with metadata like the index, ID, version, and whether the document was found.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Highlighting
|
||||
|
||||
Highlighting in Elasticsearch helps users quickly identify the search terms within the returned documents. It works by surrounding the search keywords in the results with special tags, like `<em>` and `</em>`, making them visually distinct. This allows users to easily see why a particular document matched their query without having to read the entire document. You can customize the tags used for highlighting, the fields that are highlighted, and even the way the highlighting is performed to suit your specific needs.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Histogram Aggregation
|
||||
|
||||
A histogram aggregation calculates the distribution of numeric values across a set of intervals, or "buckets." It groups data into these buckets based on their values, providing a count of how many data points fall within each bucket's range. This allows you to visualize the frequency of values within specific ranges, revealing patterns and trends in your data.
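The bucketing rule is simple enough to sketch directly: each value is rounded down to the nearest multiple of the interval, and that multiple becomes the bucket key. The function below is a toy version run on made-up values; unlike Elasticsearch's default behavior, it does not emit empty buckets.

```python
import math
from collections import Counter


def histogram(values, interval, offset=0.0):
    """Count values per bucket; each bucket key is the lower bound of its range."""
    counts = Counter()
    for value in values:
        # Round down to the nearest multiple of `interval` (shifted by `offset`).
        key = math.floor((value - offset) / interval) * interval + offset
        counts[key] += 1
    return dict(sorted(counts.items()))


prices = [5, 12, 18, 25, 27, 49]  # hypothetical values
buckets = histogram(prices, interval=10)
print(buckets)
```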
|
||||
@@ -0,0 +1,3 @@
|
||||
# How Search Works
|
||||
|
||||
Search, at its core, involves matching a user's query against the data stored in an index. This process typically begins with the user entering a search term, which is then analyzed and processed. The system then retrieves documents that contain terms matching the processed query, ranking them based on relevance to present the most suitable results to the user.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Hybrid Search
|
||||
|
||||
Hybrid search combines multiple search techniques to improve the relevance and accuracy of search results. It leverages the strengths of different approaches, such as keyword-based search and semantic search, to provide a more comprehensive and nuanced understanding of the user's query and the available data. By blending these methods, hybrid search aims to overcome the limitations of any single approach and deliver more relevant and meaningful results.
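One common way to blend ranked lists is reciprocal rank fusion (RRF), shown here as an example of the idea rather than as the only method. Each document earns `1 / (k + rank)` from every list it appears in. The constant `k=60` is a conventional choice, and the document IDs and list contents are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list receive the largest contribution.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


keyword_hits = ["a", "b", "c"]    # hypothetical keyword (BM25) results
semantic_hits = ["b", "d", "a"]   # hypothetical semantic (vector) results
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that appear in both lists rise to the top, even when neither method ranked them first on its own.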
|
||||
@@ -0,0 +1,9 @@
|
||||
# ID (Primary Key)
|
||||
|
||||
An ID, or Primary Key, is a unique identifier for each document stored within an Elasticsearch index. It distinguishes one document from another, allowing for specific retrieval, updating, and deletion of individual data entries. This unique identifier is crucial for maintaining data integrity and enabling efficient data management within the Elasticsearch system.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@_id field](http://elastic.co/docs/reference/elasticsearch/mapping-reference/mapping-id-field)
|
||||
- [@official@Index basics](https://www.elastic.co/docs/manage-data/data-store/index-basics)
|
||||
- [@official@Get a document by its ID](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-get)
|
||||
@@ -0,0 +1,3 @@
|
||||
# ID Query
|
||||
|
||||
An IDs query (`ids`) returns documents from an index based on their unique identifiers. It's a simple and efficient way to fetch specific documents when you already know their IDs: you pass a list of values, and the query matches every document whose `_id` field equals one of them.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Index Lifecycle Management (ILM)
|
||||
|
||||
Index Lifecycle Management (ILM) automates the process of managing Elasticsearch indices over time. It defines policies to control how indices are stored, moved, and deleted based on factors like index age, size, or document count. This helps optimize resource utilization, reduce storage costs, and ensure data is available when needed.
|
||||
@@ -0,0 +1,11 @@
|
||||
# Index (Database)
|
||||
|
||||
An index is a collection of documents that have similar characteristics. Think of it as a database in a relational database system. It's where Elasticsearch stores and organizes data, allowing for efficient searching and retrieval. Each index is identified by a name, which is used when performing indexing, searching, updating, and deleting operations.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Index basics](https://www.elastic.co/docs/manage-data/data-store/index-basics)
|
||||
- [@official@What is an Elasticsearch index?](https://www.elastic.co/docs/manage-data/data-store/index-basics)
|
||||
- [@article@Elasticsearch Index – How to create, list, query and delete indices](https://opster.com/guides/elasticsearch/glossary/elasticsearch-index/)
|
||||
- [@video@How Elasticsearch Works: Documents, JSON & Index Explained](https://www.youtube.com/watch?v=wHZ3JsRzukI)
|
||||
- [@video@What's ElasticSearch Used For? | Search Indexes | Systems Design Interview 0 to 1 with Ex-Google SWE](https://www.youtube.com/watch?v=wmCWCVAl1Us)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Index Document
|
||||
|
||||
To add data to Elasticsearch, you use the Index API. This API lets you create a new document within a specific index. You need to specify the index name, a unique ID for the document (or let Elasticsearch generate one), and the document's content in JSON format. When you send this information to Elasticsearch via a PUT or POST request, it analyzes the data, indexes it, and makes it searchable.
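The choice between PUT and POST can be sketched as follows. The helper only builds the request parts without sending them; the function name, index name, and documents are hypothetical, while the `/<index>/_doc` paths follow the Index API.

```python
import json


def index_request(index, document, doc_id=None):
    """Return (method, path, body) for indexing one document."""
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one.
        return "POST", f"/{index}/_doc", json.dumps(document)
    # Explicit ID: PUT creates the document or replaces an existing one.
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)


with_id = index_request("products", {"name": "pen"}, doc_id="1")
auto_id = index_request("products", {"name": "ink"})
print(with_id)
print(auto_id)
```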
|
||||
@@ -0,0 +1,13 @@
|
||||
# Elasticsearch
|
||||
|
||||
Elasticsearch is a distributed, open-source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. It's built on Apache Lucene and provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is commonly used for log analytics, full-text search, security intelligence, business analytics, and operational intelligence use cases.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Elasticsearch](https://www.elastic.co/elasticsearch)
|
||||
- [@official@Elasticsearch solution overview](https://www.elastic.co/docs/solutions/search)
|
||||
- [@official@Get started with Elasticsearch](https://www.elastic.co/docs/solutions/search/get-started)
|
||||
- [@official@Elasticsearch Labs Tutorial](https://www.elastic.co/search-labs/tutorials)
|
||||
- [@article@Elasticsearch Tutorial](https://www.tutorialspoint.com/elasticsearch/index.htm)
|
||||
- [@video@Elasticsearch Course for Beginners](https://www.youtube.com/watch?v=a4HBKEda_F8)
|
||||
- [@book@Elasticsearch The Definitive Guide](https://hlaszny.com/booksAndPapers/buckets/b8_IT/elasticsearch-the-definitive-guide.pdf)
|
||||
@@ -0,0 +1,11 @@
|
||||
# JSON
|
||||
|
||||
JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. It's based on a subset of the JavaScript programming language, and uses a text-based format to represent data objects consisting of attribute-value pairs and array data types. JSON is commonly used for transmitting data in web applications (e.g., sending some data from the server to the client, so it can be displayed on a web page) and is a standard format for APIs and configuration files.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@article@Introducing JSON](https://www.json.org/json-en.html)
|
||||
- [@article@JavaScript JSON](https://www.w3schools.com/js/js_json.asp)
|
||||
- [@article@Working with JSON](https://developer.mozilla.org/en-US/docs/Learn_web_development/Core/Scripting/JSON)
|
||||
- [@video@How Elasticsearch Works: Documents, JSON & Index Explained](https://www.youtube.com/watch?v=wHZ3JsRzukI)
|
||||
- [@video@What Is JSON | Explained](https://www.youtube.com/watch?v=cj3h3Fb10QY)
|
||||
@@ -0,0 +1,7 @@
|
||||
# Keyword Data Type
|
||||
|
||||
The Keyword data type in Elasticsearch is used for indexing fields that contain structured, string-based data. Unlike the Text data type, Keyword fields are not analyzed or tokenized; the entire string is indexed as a single term. This makes them ideal for filtering, sorting, and exact-match queries, where you need to find documents with a specific, complete value.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Keyword type family](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/keyword)
|
||||
@@ -0,0 +1,8 @@
|
||||
# Kibana Console
|
||||
|
||||
Kibana is a web interface that allows you to explore, visualize, and manage data indexed in Elasticsearch. Its Console, found under Dev Tools, is an interactive editor for sending requests to Elasticsearch APIs and viewing the responses. Beyond the Console, Kibana lets you create dashboards, charts, and maps to gain insights from your Elasticsearch data, and offers features for managing your cluster, including monitoring its health and performance.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Elastic Console](https://www.elastic.co/docs/explore-analyze/query-filter/tools/console)
|
||||
- [@video@Kibana Dev Tools: Overview, Usage & Examples - Daily Elastic Byte S02E05](https://www.youtube.com/watch?v=ZiHiH3wfgas)
|
||||
@@ -0,0 +1,3 @@
|
||||
# KQL
|
||||
|
||||
Kibana Query Language (KQL) is a query language used within Kibana to search and filter data in Elasticsearch. It allows users to construct queries using a human-readable syntax, making it easier to find specific information within their Elasticsearch indices without needing to write complex JSON-based Elasticsearch queries. KQL supports features like free text search, field-based filtering, boolean operators, and range queries.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Latest Transformation
|
||||
|
||||
The "latest" transformation in Elasticsearch is used to identify and extract the most recent document within a group of documents that share a common field value. It allows you to find the most up-to-date information for each unique entity based on a specified sorting criteria, such as a timestamp or version number. This is particularly useful when dealing with time-series data or scenarios where you need to retrieve the latest state of an object.
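
The selection logic described above can be sketched in plain Python. This is only an illustration of the idea: the field names `host` and `@timestamp` and the sample documents are assumptions, and a real transform runs inside Elasticsearch rather than in client code.

```python
def latest_per_key(docs, key_field, sort_field):
    """Keep, for each distinct key value, the document with the largest sort value."""
    latest = {}
    for doc in docs:
        key = doc[key_field]
        if key not in latest or doc[sort_field] > latest[key][sort_field]:
            latest[key] = doc
    return latest

# Hypothetical sample data; ISO 8601 timestamps in one format sort correctly as strings.
docs = [
    {"host": "web-1", "@timestamp": "2024-01-01T10:00:00Z", "status": "ok"},
    {"host": "web-1", "@timestamp": "2024-01-01T11:00:00Z", "status": "degraded"},
    {"host": "web-2", "@timestamp": "2024-01-01T09:30:00Z", "status": "ok"},
]
result = latest_per_key(docs, "host", "@timestamp")
```

Here `result` holds one document per host, which mirrors the "one row per entity" shape a latest transform produces.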
|
||||
@@ -0,0 +1,3 @@
|
||||
# Leaf vs. Compound Queries
|
||||
|
||||
Leaf queries in Elasticsearch target specific fields with simple search criteria, like finding documents where a field matches a particular value or falls within a certain range. Compound queries, on the other hand, combine multiple leaf or other compound queries to create more complex search logic, allowing you to specify how these individual queries should interact (e.g., must all match, at least one must match, or none should match).
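
As a minimal sketch, the request body below builds three leaf queries and combines them in a `bool` compound query. The field names (`title`, `year`, `status`) are assumptions chosen for illustration; the body is plain Python data that would be serialized to JSON before being sent to a search endpoint.

```python
import json

# Leaf queries: each one targets a single field.
match_title = {"match": {"title": "elasticsearch"}}
range_year = {"range": {"year": {"gte": 2020}}}
term_draft = {"term": {"status": "draft"}}

# Compound query: a bool query that states how the leaves interact.
bool_query = {
    "query": {
        "bool": {
            "must": [match_title],      # has to match, contributes to the score
            "filter": [range_year],     # has to match, does not affect the score
            "must_not": [term_draft],   # excludes matching documents
        }
    }
}

body = json.dumps(bool_query, indent=2)
```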
|
||||
@@ -0,0 +1,3 @@
|
||||
# Lucene Query Syntax
|
||||
|
||||
Lucene is a powerful text search engine library. Its query syntax provides a way to specify search criteria using terms, phrases, wildcards, and boolean operators. This allows users to perform complex searches within text-based data, going beyond simple keyword matching to define precise and nuanced search requirements.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Mapping Explosion
|
||||
|
||||
Mapping explosion in Elasticsearch refers to the uncontrolled growth of fields within an index's mapping. This typically happens when Elasticsearch automatically creates mappings for new fields as it encounters them in incoming documents. If a large number of unique and unexpected field names are introduced, the index mapping can become excessively large, consuming significant memory and degrading cluster performance through increased resource usage during mapping updates and search operations.
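
The sketch below illustrates the problem and one common guard. The first part counts distinct field paths in hypothetical documents that use user IDs as keys; the second part is an index-creation body that lowers `index.mapping.total_fields.limit` and sets `dynamic` to `strict`. The document shape, the limit of 200, and the field names are assumptions for illustration.

```python
def field_paths(doc, prefix=""):
    """Return the set of dotted leaf-field paths in a document."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths |= field_paths(value, path + ".")
        else:
            paths.add(path)
    return paths

# Hypothetical documents where a dynamic value (a user ID) is used as a field name.
docs = [{"clicks": {f"user_{i}": 1}} for i in range(100)]
all_paths = set()
for doc in docs:
    all_paths |= field_paths(doc)
# With dynamic mapping, each distinct path would become a mapped field.

# Guard rails: cap the field count and reject unmapped fields.
create_index_body = {
    "settings": {"index.mapping.total_fields.limit": 200},
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "user_id": {"type": "keyword"},
            "clicks": {"type": "integer"},
        },
    },
}
```

Modeling the user ID as a value (`user_id`) instead of a key keeps the mapping at two fields regardless of how many users appear.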
|
||||
@@ -0,0 +1,11 @@
|
||||
# Mappings
|
||||
|
||||
Mappings are like schemas in relational databases; they define how a document and its fields are stored and indexed. They specify the data type of each field (like text, keyword, date, or number) and how Elasticsearch should handle that data for searching and analysis. Mappings are crucial for ensuring data is indexed correctly and that queries return accurate and relevant results.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Mapping](https://www.elastic.co/docs/manage-data/data-store/mapping)
|
||||
- [@article@Elasticsearch Mapping](https://opster.com/guides/elasticsearch/glossary/elasticsearch-mapping/)
|
||||
- [@article@[Beginner's guide] Understanding mapping with Elasticsearch and Kibana](https://dev.to/lisahjung/beginner-s-guide-understanding-mapping-with-elasticsearch-and-kibana-3646)
|
||||
- [@video@What Are Mappings in Elasticsearch? (Explained Simply)](https://www.youtube.com/watch?v=ryXCer_rJcg)
|
||||
- [@video@Beginner’s Crash Course to Elastic Stack - Part 5: Mapping](https://www.youtube.com/watch?v=FQAHDrVwfok)
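
A minimal sketch of an explicit mapping, written as the Python dictionary that would form an index-creation request body. The field names are assumptions; the `title` field is mapped as `text` for full-text search with a `keyword` sub-field for exact matching and sorting.

```python
mapping_body = {
    "mappings": {
        "properties": {
            # Analyzed for full-text search, plus an exact-value sub-field.
            "title": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
            "published_at": {"type": "date"},
            "views": {"type": "integer"},
        }
    }
}

# List each top-level field with its declared type.
field_types = {
    name: spec["type"] for name, spec in mapping_body["mappings"]["properties"].items()
}
```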
|
||||
@@ -0,0 +1,9 @@
|
||||
# Master-Eligible Nodes
|
||||
|
||||
Master-eligible nodes in Elasticsearch are the nodes that can be elected as the master node. The master node is responsible for cluster-wide management tasks, such as creating or deleting indices, tracking which nodes are part of the cluster, and deciding how to allocate shards across the cluster. These nodes participate in the master election process and have the potential to become the cluster's central controller.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Node roles](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles)
|
||||
- [@article@Elasticsearch Nodes](https://opster.com/guides/elasticsearch/glossary/elasticsearch-node/)
|
||||
- [@video@Adding Nodes to an Elasticsearch Cluster](https://www.youtube.com/watch?v=XyQ4AN1Jn78)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Match Phrase Query
|
||||
|
||||
The Match Phrase query searches for documents that contain the exact phrase specified in the query. This means the analyzed terms must appear in the same order and be adjacent to each other, unless the optional `slop` parameter allows some distance between them. It's a stricter form of matching than a standard match query, which by default only requires at least one of the query terms to be present in the field, regardless of order or proximity.
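
The difference can be sketched with a toy model in Python. This is a simplification under stated assumptions: tokens are produced by lowercasing and splitting on whitespace (not by a real analyzer), `slop` is not modeled, and the field name `message` is hypothetical.

```python
def tokens(text):
    """Toy analysis: lowercase and split on whitespace."""
    return text.lower().split()

def match_any(doc_text, query_text):
    """Rough model of a match query with the default OR operator."""
    doc_tokens = tokens(doc_text)
    return any(term in doc_tokens for term in tokens(query_text))

def match_phrase(doc_text, query_text):
    """Rough model of match_phrase: terms adjacent and in order."""
    d, q = tokens(doc_text), tokens(query_text)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

doc = "The quick brown fox"
in_order = match_phrase(doc, "quick brown")
reversed_order = match_phrase(doc, "brown quick")
any_term = match_any(doc, "brown quick")

# The corresponding request body for a real search.
phrase_request = {"query": {"match_phrase": {"message": "quick brown"}}}
```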
|
||||
@@ -0,0 +1,3 @@
|
||||
# Match Query
|
||||
|
||||
The Match Query is a fundamental full-text search query in Elasticsearch. It allows you to search for documents that contain specific terms within a field. It analyzes the query string provided, breaking it down into individual terms based on the field's analyzer, and then searches for those terms in the specified field.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Must Queries
|
||||
|
||||
`must` is a clause within a `bool` query that lists conditions a document has to satisfy to be included in the search results. Each matching `must` clause also contributes to the document's relevance score, which distinguishes it from the `filter` clause: `filter` is equally mandatory but does not affect scoring.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Must_Not Queries
|
||||
|
||||
`must_not` is a clause within a `bool` query that filters out documents matching the specified query. It defines conditions that documents should *not* satisfy to be included in the search results. Essentially, it excludes documents that would otherwise be considered relevant based on other clauses in the `bool` query.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Nested Aggregations
|
||||
|
||||
Nested aggregations allow you to perform aggregations on nested objects within your documents. These nested objects are stored as separate documents internally by Elasticsearch, and nested aggregations provide a way to access and analyze the data within these nested structures as if they were part of the parent document. This is particularly useful when you have complex data structures where related information is embedded within a single document.
|
||||
@@ -0,0 +1,9 @@
|
||||
# Nested Data Type
|
||||
|
||||
The nested data type is used to represent arrays of objects within a document. Each object in the array is indexed as a separate hidden document, allowing you to query and filter based on the properties of individual objects within the array without values from other objects in the same array leaking into the match. This is particularly useful when you need to perform complex queries on related objects stored within a single document.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Nested field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/nested)
|
||||
- [@video@Nested vs object elasticsearch | How do I query nested objects in Elasticsearch?](https://www.youtube.com/watch?v=YIFDzfImSF8)
|
||||
- [@video@Querying Nested Objects in Elasticsearch](https://www.youtube.com/watch?v=UeAHBLJDFR8)
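
The following sketch models, in plain Python, why arrays of objects behave differently under the default object type and the nested type. It is a simulation of the matching semantics, not Elasticsearch code; the `user` documents are made-up sample data.

```python
doc = {
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ]
}

def flatten_object_array(objects):
    """Model of the object type: values are collected per field, losing pairing."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat

flat = flatten_object_array(doc["user"])

# Object semantics: "Alice" and "Smith" both exist somewhere in the array.
object_match = "Alice" in flat["first"] and "Smith" in flat["last"]

# Nested semantics: both conditions must hold within the same object.
nested_match = any(
    u["first"] == "Alice" and u["last"] == "Smith" for u in doc["user"]
)
```

The object-style check matches even though no single user is named Alice Smith, while the nested-style check does not.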
|
||||
@@ -0,0 +1,10 @@
|
||||
# Node (Instance)
|
||||
|
||||
A node is a single server within an Elasticsearch cluster that stores data and participates in the cluster's indexing and search capabilities. Each node is configured with a name and can be assigned specific roles, such as master, data, or ingest, to optimize resource allocation and cluster performance. Nodes communicate with each other to distribute data, manage cluster state, and handle search requests.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Node settings](https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/node-settings)
|
||||
- [@official@Clusters, nodes, and shards](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards)
|
||||
- [@official@Add and Remove Elasticsearch nodes](https://www.elastic.co/docs/deploy-manage/maintenance/add-and-remove-elasticsearch-nodes)
|
||||
- [@video@Nodes, Clusters & Shards - Elasticsearch 101 Course, Episode 2](https://www.youtube.com/watch?v=sAySPSyL2qE)
|
||||
@@ -0,0 +1,7 @@
|
||||
# Numeric Data Types
|
||||
|
||||
Numeric data types in Elasticsearch are used to store numerical values, such as integers and floating-point numbers. These types allow you to efficiently store and query numerical data, enabling operations like range queries, aggregations, and sorting based on numerical values. Elasticsearch offers various numeric types to optimize storage and performance based on the expected range and precision of your data.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Numeric field types](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/number)
|
||||
@@ -0,0 +1,7 @@
|
||||
# Object Data Type
|
||||
|
||||
An object is a data type that allows you to store inner JSON objects within a single document. This means you can represent complex, hierarchical data structures where a field can contain other fields and their corresponding values, similar to how objects are structured in programming languages. Elasticsearch indexes the inner fields under dotted paths (for example, `user.name`), so you can query on the properties within the structure. Arrays of objects are flattened in the process, which loses the association between fields of the same array element; the nested data type exists for that case.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Object field type](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/object)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Optimizing Bulk Indexing
|
||||
|
||||
Bulk indexing in Elasticsearch is the process of sending multiple indexing, updating, or deleting operations in a single request. Optimizing this process involves tuning various parameters and strategies to maximize throughput and minimize resource consumption, ensuring data is efficiently loaded into Elasticsearch. This includes adjusting batch sizes, managing thread pools, and leveraging techniques like request routing and refresh interval adjustments.
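
A minimal sketch of the batching step: the helper below splits documents into fixed-size batches and renders each batch as a newline-delimited JSON body of the kind the `_bulk` endpoint expects (an action line followed by a source line, with a trailing newline). The index name `my-index`, the helper name, and the batch size are assumptions; a suitable batch size has to be found by measuring.

```python
import json

def bulk_bodies(docs, index, batch_size):
    """Yield NDJSON request bodies, each covering up to batch_size documents."""
    for start in range(0, len(docs), batch_size):
        lines = []
        for doc in docs[start:start + batch_size]:
            lines.append(json.dumps({"index": {"_index": index}}))  # action line
            lines.append(json.dumps(doc))                           # source line
        yield "\n".join(lines) + "\n"  # the body must end with a newline

docs = [{"n": i} for i in range(5)]
bodies = list(bulk_bodies(docs, "my-index", batch_size=2))
```

Five documents with a batch size of two produce three request bodies, the last one holding a single document.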
|
||||
@@ -0,0 +1,3 @@
|
||||
# Pagination
|
||||
|
||||
Pagination divides search results into discrete pages, allowing users to navigate through large datasets in manageable chunks. Instead of displaying all results at once, which can be overwhelming and resource-intensive, pagination presents a subset of results per page, improving user experience and reducing server load. This involves specifying the starting point (from) and the number of results to return (size) for each page.
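
The `from` and `size` arithmetic can be sketched as a small helper. The function name and the `match_all` query are assumptions for illustration; pages are numbered from 1.

```python
def page_request(page, size):
    """Build a search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "from": (page - 1) * size,  # number of hits to skip
        "size": size,               # number of hits to return
        "query": {"match_all": {}},
    }

third_page = page_request(page=3, size=20)
```

By default, `from` plus `size` cannot exceed the index's `index.max_result_window` setting (10,000), so deep pagination is usually handled with `search_after` instead.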
|
||||
@@ -0,0 +1,3 @@
|
||||
# Pipeline Aggregations
|
||||
|
||||
Pipeline aggregations in Elasticsearch take the results of other aggregations as their input, allowing you to perform calculations and derive new insights based on the aggregated data. Instead of operating on the documents themselves, they process the output of other aggregations, enabling you to create complex analytical pipelines within your search queries. This allows for calculations like moving averages, derivatives, and cumulative sums to be performed directly within Elasticsearch.
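
As a sketch, the request body below nests a `cumulative_sum` pipeline aggregation next to a `sum` aggregation inside a monthly `date_histogram`; `buckets_path` names the sibling aggregation whose output it consumes. The field names `date` and `price` and the bucket values are assumptions. The second part reproduces the same calculation locally on made-up bucket totals.

```python
from itertools import accumulate

request = {
    "size": 0,
    "aggs": {
        "sales_per_month": {
            "date_histogram": {"field": "date", "calendar_interval": "month"},
            "aggs": {
                "sales": {"sum": {"field": "price"}},
                # Pipeline aggregation: reads the "sales" value of each bucket.
                "cumulative_sales": {"cumulative_sum": {"buckets_path": "sales"}},
            },
        }
    },
}

# Local model: the pipeline step works on bucket results, not on documents.
monthly_sales = [550, 60, 375]                    # hypothetical "sales" per bucket
cumulative_sales = list(accumulate(monthly_sales))
```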
|
||||
@@ -0,0 +1,3 @@
|
||||
# Pivot Transformation
|
||||
|
||||
The pivot transformation in Elasticsearch reshapes your data by grouping documents on one or more fields and computing aggregations for each group. The results are written to a new destination index in which each document summarizes one entity, such as a customer or a host, rather than one event. This entity-centric view makes it easier to analyze and visualize trends and patterns that might be hidden in the original event-level structure.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Prefix Query
|
||||
|
||||
A prefix query finds documents that contain terms starting with a specific prefix. It operates at the term level, meaning it searches for the prefix directly within the indexed terms of a field. This query is useful for implementing features like autocompletion or searching for products based on the beginning of their names.
|
||||
@@ -0,0 +1,11 @@
|
||||
# Primary Shards
|
||||
|
||||
Primary shards are the fundamental units of data storage in Elasticsearch. An index is logically divided into one or more primary shards, each of which contains a portion of the index's data. These shards allow Elasticsearch to distribute data across multiple nodes in a cluster, enabling horizontal scaling and improved performance. The number of primary shards is set at index creation and cannot be changed in place afterwards (doing so requires reindexing or the split or shrink APIs), and it bounds how far indexing for that index can be spread across nodes.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Clusters, nodes, and shards](https://www.elastic.co/docs/deploy-manage/distributed-architecture/clusters-nodes-shards)
|
||||
- [@official@Size your shards](https://www.elastic.co/docs/deploy-manage/production-guidance/optimize-performance/size-shards)
|
||||
- [@article@Understanding Shards in Elasticsearch](https://opster.com/guides/elasticsearch/glossary/what-are-shards-in-elasticsearch/)
|
||||
- [@article@Elasticsearch shards and replicas: A practical guide](https://www.elastic.co/search-labs/blog/elasticsearch-shards-and-replicas-guide)
|
||||
- [@video@Nodes, clusters, and shards in Elasticsearch - S1E3:Mini Beginner's Crash Course](https://www.youtube.com/watch?v=9uJNksCj2f8)
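
One reason the primary shard count is fixed is that documents are routed to shards by hashing a routing value (the document ID by default) modulo a number derived from the shard count. The sketch below is a simplified model under stated assumptions: it uses `zlib.crc32` as a stand-in hash and a plain modulo, whereas Elasticsearch uses its own hash function and an extra routing factor.

```python
import zlib

def shard_for(routing, num_primary_shards):
    """Simplified routing: stand-in hash of the routing value modulo shard count."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards

ids = [f"doc-{i}" for i in range(100)]
with_five = {doc_id: shard_for(doc_id, 5) for doc_id in ids}
with_six = {doc_id: shard_for(doc_id, 6) for doc_id in ids}

# Documents that would be looked up on a different shard after the change.
moved = [doc_id for doc_id in ids if with_five[doc_id] != with_six[doc_id]]
```

Under this model, changing the shard count changes where many existing documents would be looked up, which is why the data has to be rewritten rather than the setting simply edited.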
|
||||
@@ -0,0 +1,3 @@
|
||||
# Query DSL
|
||||
|
||||
Query DSL (Domain Specific Language) is a JSON-based language used to define and execute search queries in Elasticsearch. It provides a structured way to express complex search criteria, including boolean logic, term matching, range queries, and more, allowing users to precisely specify what data they want to retrieve.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Query
|
||||
|
||||
A query is a request for information from a data source. It specifies the criteria for retrieving specific data that matches the defined conditions. In Elasticsearch, queries are used to search and retrieve documents that match certain criteria within an index. These queries can range from simple keyword searches to complex combinations of filters and conditions.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Range/Date Range Aggregations
|
||||
|
||||
Range and Date Range aggregations are used to categorize documents into buckets based on numeric or date values falling within specified ranges. These aggregations allow you to define custom intervals for grouping data, providing flexibility in analyzing distributions and trends across your dataset. You can define specific start and end points for each range, enabling you to create meaningful segments for your analysis.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Range Query
|
||||
|
||||
A range query allows you to find documents where the value of a specific field falls within a specified range. This range can be defined using upper and lower bounds, which can be inclusive or exclusive. It's useful for filtering data based on numerical values, dates, or even strings that can be lexicographically compared.
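
A minimal sketch of two range query bodies, followed by a small local model of the bound semantics. The field names `price` and `created_at` are assumptions; `gte`/`lte` are inclusive bounds and `gt`/`lt` are exclusive.

```python
# Numeric range: 10 <= price < 100.
price_range = {"query": {"range": {"price": {"gte": 10, "lt": 100}}}}

# Date range using date math: the last seven days, rounded to whole days.
recent = {"query": {"range": {"created_at": {"gte": "now-7d/d", "lte": "now/d"}}}}

def in_range(value, bounds):
    """Local model of how the four bound operators are evaluated."""
    checks = {
        "gt": lambda v, b: v > b,
        "gte": lambda v, b: v >= b,
        "lt": lambda v, b: v < b,
        "lte": lambda v, b: v <= b,
    }
    return all(checks[op](value, bound) for op, bound in bounds.items())

bounds = price_range["query"]["range"]["price"]
results = [in_range(v, bounds) for v in (9, 10, 99, 100)]
```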
|
||||
@@ -0,0 +1,9 @@
|
||||
# Replica Shards
|
||||
|
||||
Replica shards are copies of primary shards within an Elasticsearch index. They provide redundancy, ensuring data availability even if a primary shard fails. Additionally, replica shards serve read requests, distributing the load and improving search performance by allowing Elasticsearch to process queries in parallel across multiple shards.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Reading and writing documents](https://www.elastic.co/docs/deploy-manage/distributed-architecture/reading-and-writing-documents)
|
||||
- [@article@Elasticsearch shards and replicas: A practical guide](https://www.elastic.co/search-labs/blog/elasticsearch-shards-and-replicas-guide)
|
||||
- [@video@Nodes, clusters, and shards in Elasticsearch - S1E3:Mini Beginner's Crash Course](https://www.youtube.com/watch?v=9uJNksCj2f8)
|
||||
@@ -0,0 +1,11 @@
|
||||
# REST API Basics
|
||||
|
||||
REST (Representational State Transfer) is an architectural style for building networked applications, and a REST API is an application programming interface that follows it. It relies on a stateless, client-server communication protocol, typically HTTP, to perform operations on resources. These operations, often referred to as CRUD (Create, Read, Update, Delete), are executed using standard HTTP methods like GET, POST, PUT, and DELETE, allowing different software systems to interact with each other over a network.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@roadmap@Visit the Dedicated API Design Roadmap](https://roadmap.sh/api-design)
|
||||
- [@article@What is REST API?](http://cloud.google.com/discover/what-is-rest-api?hl=en)
|
||||
- [@article@What is REST API? - IBM](https://www.ibm.com/think/topics/rest-apis)
|
||||
- [@video@What Is REST API? Examples And How To Use It: Crash Course System Design #3](https://www.youtube.com/watch?v=-mN3VyJuCjM)
|
||||
- [@video@What is a REST API?](https://www.youtube.com/watch?v=lsMQRaeKNDk)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Roles & Users
|
||||
|
||||
Roles and users are fundamental components of security in Elasticsearch. Roles define a set of privileges, specifying what actions a user can perform on which resources (like indices or clusters). Users are then assigned one or more roles, granting them the combined permissions of those roles. This system allows administrators to control access to data and cluster operations, ensuring that only authorized individuals can perform specific tasks within the Elasticsearch environment.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Rollover Policies
|
||||
|
||||
Rollover policies in Elasticsearch automate the management of indices over time. They define conditions, such as index size, document count, or age, that trigger the creation of a new index and the transition of write operations to it. This process helps maintain manageable index sizes, optimize search performance, and simplify data retention strategies.
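
As a sketch, the dictionary below has the shape of an index lifecycle management policy whose hot phase rolls over on age, primary shard size, or document count, whichever is reached first. All thresholds are assumptions for illustration. The helper underneath models the "any condition met" rule locally.

```python
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {
                        "max_age": "30d",
                        "max_primary_shard_size": "50gb",
                        "max_docs": 100_000_000,
                    }
                }
            },
            # Retention: delete indices 90 days after rollover.
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    }
}

def should_rollover(age_days, doc_count, max_age_days, max_docs):
    """Local model: rollover triggers as soon as any one condition is met."""
    return age_days >= max_age_days or doc_count >= max_docs

young_but_full = should_rollover(age_days=2, doc_count=150_000_000,
                                 max_age_days=30, max_docs=100_000_000)
young_and_small = should_rollover(age_days=2, doc_count=1_000,
                                  max_age_days=30, max_docs=100_000_000)
```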
|
||||
@@ -0,0 +1,12 @@
|
||||
# Running Elasticsearch with Docker
|
||||
|
||||
Docker provides a convenient and isolated environment to run applications, including Elasticsearch. Using Docker, you can quickly set up an Elasticsearch instance without worrying about operating system compatibility or dependency conflicts. This involves pulling the official Elasticsearch image from a registry like Docker Hub, configuring the necessary environment variables and port mappings, and then starting the container. This approach simplifies deployment and ensures consistency across different environments.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@Install Elasticsearch with Docker](https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker)
|
||||
- [@official@Start a single-node cluster in Docker](https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-docker-basic)
|
||||
- [@article@Elastic Stack with Docker getting started. Elasticsearch, Kibana, and Filebeat.](https://medium.com/@vosarat1995/elastic-stack-with-docker-getting-started-elasticsearch-kibana-and-filebeat-ebe75fd13041)
|
||||
- [@article@A beginner's guide to running Elasticsearch with Docker and Docker Compose](https://geshan.com.np/blog/2023/06/elasticsearch-docker/)
|
||||
- [@roadmap@Visit the Dedicated Docker Roadmap](https://roadmap.sh/docker)
|
||||
- [@video@How to Install Elasticsearch using Docker - Step by Step Guide](https://www.youtube.com/watch?v=p9IWwTDHgcU)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Search Analyzer
|
||||
|
||||
A search analyzer in Elasticsearch is responsible for processing the query text provided by a user before it's used to search the index. It transforms the query text into a format that matches the indexed data, ensuring relevant results are retrieved. This process typically involves character filtering, tokenization, and token filtering, similar to the analysis process performed on documents during indexing, but tailored for search queries.
|
||||
@@ -0,0 +1,10 @@
|
||||
# Search Engines vs. Relational Databases
|
||||
|
||||
Search engines are designed for quickly finding relevant information within large volumes of unstructured or semi-structured text, prioritizing speed and relevance scoring. Relational databases, on the other hand, are structured systems optimized for managing and querying structured data with strong consistency and transactional integrity, using predefined schemas and relationships between tables.
|
||||
|
||||
Visit the following resources to learn more:
|
||||
|
||||
- [@official@What is a search engine?](https://www.elastic.co/what-is/search-engine)
|
||||
- [@article@A Guide to Search Engine Databases](https://www.influxdata.com/search-engine-database/)
|
||||
- [@article@Full Text Search Engines vs. DBMS](https://lucidworks.com/blog/full-text-search-engines-vs-dbms)
|
||||
- [@roadmap@Visit the Dedicated SQL Roadmap](https://roadmap.sh/sql)
|
||||
@@ -0,0 +1,3 @@
|
||||
# Segment Merging
|
||||
|
||||
Segment merging is the process of combining multiple smaller segments in an Elasticsearch index into larger segments. This optimization reduces the number of segments the search engine needs to consult during a query, leading to faster search performance and more efficient resource utilization. The process involves reading the data from the smaller segments, merging them, and writing the merged data into a new, larger segment.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Semantic Search
|
||||
|
||||
Semantic search aims to improve search accuracy by understanding the intent and contextual meaning of search queries. Instead of relying solely on keyword matching, it analyzes the relationships between words and concepts to deliver more relevant results. This involves using techniques like natural language processing (NLP) and machine learning to interpret the meaning behind the query and match it with documents that have similar meaning, even if they don't contain the exact keywords.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Should Query
|
||||
|
||||
`should` is a clause within a `bool` query that lists optional sub-queries. Each matching `should` clause increases a document's relevance score, but when the `bool` query also contains `must` or `filter` clauses, no `should` clause has to match for a document to be included. If neither `must` nor `filter` is present, at least one `should` clause must match. The `minimum_should_match` parameter overrides this default.
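
The default behaviour can be sketched as follows. The request body uses hypothetical fields (`in_stock`, `title`), and the helper is a local model of how the default for `minimum_should_match` depends on the other clauses, not an Elasticsearch API.

```python
bool_query = {
    "query": {
        "bool": {
            "filter": [{"term": {"in_stock": True}}],
            "should": [
                {"match": {"title": "wireless"}},
                {"match": {"title": "bluetooth"}},
            ],
        }
    }
}

def default_minimum_should_match(bool_clause):
    """Model of the default: 1 only when should stands alone, otherwise 0."""
    has_required = bool(bool_clause.get("must") or bool_clause.get("filter"))
    has_should = bool(bool_clause.get("should"))
    return 1 if has_should and not has_required else 0

with_filter = default_minimum_should_match(bool_query["query"]["bool"])
should_only = default_minimum_should_match(
    {"should": bool_query["query"]["bool"]["should"]}
)
```

In the first case the `should` clauses only influence ranking; adding `"minimum_should_match": 1` to the `bool` query would make at least one of them mandatory.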
|
||||
@@ -0,0 +1,3 @@
|
||||
# SLM
|
||||
|
||||
Snapshot Lifecycle Management (SLM) provides a way to automate the creation, retention, and deletion of Elasticsearch snapshots. It allows you to define policies that specify when snapshots should be taken, how long they should be kept, and how they should be named, ensuring consistent and reliable backups of your Elasticsearch data.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Snapshots and Restores
|
||||
|
||||
Snapshots are backups of your Elasticsearch cluster's data and state, stored in a repository. Restoring from a snapshot allows you to recover data in case of failure, corruption, or accidental deletion. This mechanism provides a way to revert your cluster to a previous point in time, ensuring data safety and disaster recovery capabilities.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Sorting
|
||||
|
||||
Sorting in Elasticsearch lets you order the search results based on the values of specific fields. By default, Elasticsearch sorts results by relevance score, but you can change this to sort by other criteria like date, price, or any other field in your documents. This allows you to present the most relevant or useful information to users based on their specific needs, such as showing the newest products first or listing items from lowest to highest price.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Source Filtering
|
||||
|
||||
Source filtering in Elasticsearch allows you to control which fields are returned in the `_source` field of your search results. Instead of retrieving the entire document, you can specify which fields you need, reducing network traffic and improving performance. This is achieved by including or excluding specific fields based on patterns or exact names.
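
A minimal sketch of a search body that uses `_source` with `includes` and `excludes` patterns, followed by a local model of the filtering. The field names are assumptions, and the model works on an already flattened document using `fnmatch` wildcards, which only approximates how Elasticsearch applies the patterns.

```python
from fnmatch import fnmatch

search_body = {
    "_source": {"includes": ["title", "author.*"], "excludes": ["author.email"]},
    "query": {"match_all": {}},
}

def filter_source(flat_doc, includes, excludes):
    """Keep paths matching an include pattern and no exclude pattern."""
    kept = {}
    for path, value in flat_doc.items():
        included = not includes or any(fnmatch(path, p) for p in includes)
        excluded = any(fnmatch(path, p) for p in excludes)
        if included and not excluded:
            kept[path] = value
    return kept

flat_doc = {
    "title": "Intro",
    "author.name": "Ada",
    "author.email": "ada@example.com",
    "body": "a long text field",
}
returned = filter_source(flat_doc, **search_body["_source"])
```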
|
||||
@@ -0,0 +1,3 @@
|
||||
# Elasticsearch SQL
|
||||
|
||||
Elasticsearch SQL allows you to query Elasticsearch data using the familiar SQL syntax. Instead of using Elasticsearch's native query DSL (Domain Specific Language), you can write SQL statements to retrieve, filter, and aggregate data stored in Elasticsearch indices. This provides a more accessible way for users familiar with SQL to interact with Elasticsearch, enabling them to leverage their existing skills to analyze and extract insights from their data.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Standard Analyzer
|
||||
|
||||
The Standard Analyzer is the default text analyzer in Elasticsearch. It splits text into terms on word boundaries, as defined by the Unicode Text Segmentation algorithm, which removes most punctuation, and it converts all terms to lowercase. Stop word removal is supported but disabled by default, so common words like "the," "a," and "is" are kept unless you configure a stop word list. This analyzer is a good general-purpose choice for many text indexing and searching tasks.
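
The sketch below gives a rough feel for the output. It is only an approximation: a regular expression over word characters stands in for Unicode Text Segmentation, so it will differ from the real analyzer on inputs such as apostrophes or some non-Latin scripts. The second part shows an analyzer definition that turns on English stop words through the `stopwords` parameter, which defaults to `_none_`; the analyzer name `my_english_standard` is hypothetical.

```python
import re

def approx_standard_analyze(text):
    """Rough approximation: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())

tokens = approx_standard_analyze("The QUICK brown-foxes")

# Index settings for a standard analyzer variant with English stop words enabled.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_english_standard": {
                    "type": "standard",
                    "stopwords": "_english_",
                }
            }
        }
    }
}
```

Note that the stop word "the" is still present in `tokens`, matching the default configuration without a stop word list.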
|
||||
@@ -0,0 +1,3 @@
|
||||
# Stats and Extended Stats Aggregations
|
||||
|
||||
Stats and Extended Stats aggregations are used to calculate various statistical measures from a set of numeric values. The Stats aggregation provides basic statistics like count, min, max, average, and sum. The Extended Stats aggregation builds upon this by adding standard deviation, sum of squares, variance, and other related metrics, offering a more comprehensive statistical overview of the data.
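
The measures can be reproduced locally to see what each aggregation reports. This is a sketch on made-up values, assuming the population form of variance and standard deviation; the real aggregations return further fields not modeled here.

```python
import math

def stats(values):
    """The five measures reported by a stats aggregation."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }

def extended_stats(values):
    """Adds sum of squares, population variance, and standard deviation."""
    result = stats(values)
    avg = result["avg"]
    variance = sum((v - avg) ** 2 for v in values) / len(values)
    result.update({
        "sum_of_squares": sum(v * v for v in values),
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    })
    return result

summary = extended_stats([2, 4, 4, 4, 5, 5, 7, 9])
```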
|
||||
@@ -0,0 +1,3 @@
|
||||
# Synonyms Graph
|
||||
|
||||
Synonyms Graph is a feature in Elasticsearch that allows you to expand your search queries by including words or phrases that have similar meanings. Instead of just searching for the exact terms entered by a user, Elasticsearch can also search for related terms defined as synonyms, improving the recall of search results. The "graph" aspect refers to how these synonyms are represented internally, allowing for more complex relationships between terms, including multi-word synonyms and different synonym types.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Term Query
|
||||
|
||||
A term query is a simple search that looks for documents containing an exact, unanalyzed term in a specific field. It's like searching for a specific word or value without any stemming, synonyms, or other text processing applied to the search term. Because analyzed `text` fields store processed tokens rather than the original string, term queries are best suited to keyword, numeric, and date fields, where you know the precise value you're looking for and want documents that contain it exactly as is.
|
||||
@@ -0,0 +1,3 @@
|
||||
# Terms Aggregation
|
||||
|
||||
The Terms aggregation is a multi-bucket aggregation that groups documents based on the values found in a specific field. It creates a bucket for each unique value and counts the number of documents that contain that value. This allows you to identify the most frequent values within your data and gain insights into the distribution of values in a field.
|
||||
Some files were not shown because too many files have changed in this diff