Introduction to Elasticsearch Search Engine: From Index Creation to Query Optimization
Published: 2024-09-13 20:13:30
# Introduction to Elasticsearch: From Index Creation to Query Optimization
Elasticsearch is an open-source distributed search and analytics engine based on Apache Lucene, featuring the following key characteristics:
* **Distributed Architecture:** Elasticsearch scales horizontally across multiple nodes. Indexes are split into shards that can be replicated, which provides high availability and scalability.
* **Near Real-time Indexing:** Newly indexed documents become searchable after the next refresh, which by default happens about once per second, giving a near real-time search experience.
* **Full-text Search:** Elasticsearch supports full-text search on text fields and provides a rich query syntax and filtering options.
* **Aggregations and Analytics:** Elasticsearch offers aggregation and analytics capabilities for grouping, counting, and statistical analysis of data.
# 2. Elasticsearch Data Model
### 2.1 Documents and Fields
Data in Elasticsearch is stored in JSON objects called **documents**. Each document contains one or more **fields**, which are specific attributes or values within the document. Fields can be of various data types, including strings, numbers, dates, booleans, and objects.
**Example Document:**
```json
{
"title": "Elasticsearch Getting Started Guide",
"author": "John Doe",
"date": "2023-03-08",
"content": "This is an article about the Elasticsearch getting started guide."
}
```
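As a rough illustration, a document like the one above maps naturally onto a Python dictionary. The sketch below uses only the standard library and does not contact Elasticsearch; the variable names are illustrative.

```python
import json
from datetime import date

# A document is a JSON object; in Python it is naturally a dict.
document = {
    "title": "Elasticsearch Getting Started Guide",
    "author": "John Doe",
    "date": "2023-03-08",
    "content": "This is an article about the Elasticsearch getting started guide.",
}

# Serialize to the JSON text that would be sent over HTTP.
payload = json.dumps(document)

# Fields are just keys; values keep their JSON types after a round trip.
restored = json.loads(payload)
field_names = sorted(restored)

# Dates travel as strings; an ISO 8601 value such as "2023-03-08" parses cleanly.
published = date.fromisoformat(restored["date"])
```

JSON has no date type, which is why the `date` field is written as a string and only interpreted as a date once a mapping says so.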
### 2.2 Indexes and Types
An **index** is a logical container for storing documents in Elasticsearch. It is roughly comparable to a table in a relational database, but more flexible, because documents in the same index do not all need to have the same fields. Each index is identified by a **name**.
A **type** was a logical grouping of documents within an index in older versions of Elasticsearch. Early documentation compared an index to a database and a type to a table (not to a column). Indexes were limited to a single type in Elasticsearch 6.x, types were deprecated in 7.x, and they were removed in 8.0, so current versions store one kind of document per index and address it through the fixed `_doc` endpoint.
**Example Index and Types:**
* Legacy layout (5.x and earlier): index `articles` with types `article` and `author`
* Current layout (7.x and later): one index per kind of document, for example `articles` and `authors`
### 2.2.1 Document ID and Source
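Each document has a **document ID** that is unique within its index and is stored in the `_id` metadata field. Elasticsearch generates the ID automatically when none is supplied; it can also be specified explicitly when the document is indexed.
The **source** of a document is its original JSON representation, stored in the `_source` field and returned with search hits by default. It contains all fields and values of the document as they were submitted.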
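The relationship between IDs and source can be pictured with a small in-memory stand-in. This is only a sketch: `ToyIndex` is a hypothetical class, not part of any client library, and `uuid4` is a placeholder for Elasticsearch's own ID generation scheme.

```python
import uuid


class ToyIndex:
    """In-memory stand-in for an index; NOT how Elasticsearch stores data."""

    def __init__(self):
        self._docs = {}

    def index(self, source, doc_id=None):
        # With no ID supplied, generate one (uuid4 is only a stand-in here).
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        # IDs are kept as strings; reusing an ID replaces the stored source.
        self._docs[str(doc_id)] = dict(source)
        return str(doc_id)

    def get(self, doc_id):
        # Return the ID together with the original JSON body.
        return {"_id": str(doc_id), "_source": self._docs[str(doc_id)]}


articles = ToyIndex()
explicit_id = articles.index({"title": "Elasticsearch Getting Started Guide"}, doc_id=1)
auto_id = articles.index({"title": "Another article"})
fetched = articles.get(1)
```

Indexing a second document under `doc_id=1` would overwrite the first one in this toy model, which mirrors how re-indexing with an existing ID replaces a document.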
### 2.2.2 Mappings
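**Mappings** define the structure of documents within an index. They specify the name, data type, and other attributes of each field. Mappings are usually defined when an index is created. New fields can be added later, but the type of an existing field generally cannot be changed in place; that normally requires creating a new index with the corrected mapping and reindexing the data.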
**Example Mapping:**
```json
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "author": { "type": "keyword" },
      "date": { "type": "date" },
      "content": { "type": "text" }
    }
  }
}
```
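To make the role of a mapping more concrete, the sketch below checks documents against the four field definitions from the example. `check_document` is a hypothetical helper written for this article. It is deliberately stricter than Elasticsearch, which by default adds unknown fields through dynamic mapping and coerces some values.

```python
from datetime import date

# Field definitions copied from the example mapping.
properties = {
    "title": {"type": "text"},
    "author": {"type": "keyword"},
    "date": {"type": "date"},
    "content": {"type": "text"},
}


def check_document(doc, properties):
    """Return a list of problems; an empty list means the toy check passed."""
    problems = []
    for field, value in doc.items():
        spec = properties.get(field)
        if spec is None:
            problems.append(f"{field}: not in mapping")
        elif spec["type"] in ("text", "keyword") and not isinstance(value, str):
            problems.append(f"{field}: expected a string")
        elif spec["type"] == "date":
            try:
                date.fromisoformat(value)
            except (TypeError, ValueError):
                problems.append(f"{field}: not an ISO 8601 date")
    return problems


good = check_document({"title": "Guide", "date": "2023-03-08"}, properties)
bad = check_document({"title": 42, "date": "yesterday", "tags": ["x"]}, properties)
```

Here `good` is an empty list, while `bad` reports one problem per offending field.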
### 2.2.3 Index Lifecycle
The index lifecycle includes the following stages:
* **Creation:** When an index is created, Elasticsearch sets up its structure based on the specified settings and mappings.
* **Write:** Documents can be added to the index; Elasticsearch validates and indexes them according to the mappings.
* **Refresh:** A refresh turns recently indexed documents held in the in-memory buffer into a new segment, making them searchable. It does not by itself guarantee that the segment has been synced to disk.
* **Commit (flush):** A flush performs a Lucene commit, which syncs segments to disk and allows the transaction log to be trimmed. Until then, the durability of recent writes relies on the transaction log.
* **Close:** A closed index is blocked for both read and write operations; it must be reopened before it can be searched or written to again.
* **Delete:** Once an index is deleted, it and its documents are permanently removed from Elasticsearch.
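The write, refresh, and commit stages can be pictured with a toy model. `ToyShard` below is a hypothetical class that only mimics the visibility rules; real shards use Lucene segments and a transaction log, and the commit step is exposed through the flush API.

```python
class ToyShard:
    """Toy model of the write path; not a real storage engine."""

    def __init__(self):
        self.buffer = []      # indexed but not yet searchable
        self.searchable = []  # visible to search after a refresh
        self.translog = []    # operations not yet covered by a commit
        self.durable = []     # stand-in for segments committed to disk

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Move buffered documents into the searchable set.
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def flush(self):
        # Commit step: refresh, persist what is searchable, clear the log.
        self.refresh()
        self.durable = list(self.searchable)
        self.translog.clear()

    def search(self, word):
        return [d for d in self.searchable if word in d["content"].lower()]


shard = ToyShard()
shard.index({"content": "Elasticsearch getting started"})
before_refresh = shard.search("elasticsearch")
shard.refresh()
after_refresh = shard.search("elasticsearch")
shard.flush()
```

In this model the document is invisible to `search` until `refresh` runs, which is the reason search is described as near real-time rather than real-time.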
### Code Examples
**Creating Index and Mappings:**
```python
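from elasticsearch import Elasticsearch

# Connect to a local node (the URL is an assumption; adjust for your cluster).
# Written for the 8.x Python client; older clients pass these as body={...}.
es = Elasticsearch("http://localhost:9200")

# Create the index with a typeless mapping (mapping types were removed in 8.0)
es.indices.create(
    index="articles",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "author": {"type": "keyword"},
            "date": {"type": "date"},
            "content": {"type": "text"},
        }
    },
)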
```
**Adding a Document:**
```python
# Add a document with an explicit ID
# (8.x client: `document=` replaces the deprecated `body=`)
es.index(
    index="articles",
    id=1,
    document={
        "title": "Elasticsearch Getting Started Guide",
        "author": "John Doe",
        "date": "2023-03-08",
        "content": "This is an article about the Elasticsearch getting started guide.",
    },
)
```
**Searching for Documents:**
```python
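# Search for documents (the original snippet is cut off after `res`; a simple match query is assumed)
res = es.search(index="articles", query={"match": {"content": "elasticsearch"}})
for hit in res["hits"]["hits"]:
    print(hit["_id"], hit["_source"]["title"])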
```
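A search request for the `articles` index is just a JSON body, so it can be assembled and inspected without a running cluster. The sketch below uses a hypothetical helper, `build_article_query`, to combine a full-text `match` clause with exact `term` and `range` filters inside a `bool` query. Clauses in `filter` do not affect relevance scores and can be cached, which is a common first step in query optimization.

```python
def build_article_query(text, author=None, published_after=None):
    """Build a search request body as a plain dict (hypothetical helper)."""
    filters = []
    if author is not None:
        # term: exact match, suited to keyword fields such as `author`.
        filters.append({"term": {"author": author}})
    if published_after is not None:
        # range: keep documents dated on or after the given day.
        filters.append({"range": {"date": {"gte": published_after}}})
    return {
        "query": {
            "bool": {
                # must: scored full-text match on the `content` text field.
                "must": [{"match": {"content": text}}],
                # filter: yes/no conditions that do not influence scoring.
                "filter": filters,
            }
        }
    }


body = build_article_query(
    "getting started", author="John Doe", published_after="2023-01-01"
)
```

The resulting `body["query"]` value is what a client would send as the query of a search request.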