Elasticsearch: The Definitive Guide

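1: Elasticsearch (ES) is a real-time distributed search and analytics engine. It handles full-text search, structured search, analytics, and any combination of the three, which makes fast processing of large data sets possible.

2: Both Elasticsearch and Kibana are version 6.4 here (the Elasticsearch version must be at least as high as the Kibana version).

Start ES
Start Kibana
Open http://localhost:5601/ to reach the Kibana management and monitoring UI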

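3: Concepts

Index: comparable to a database in an RDBMS.
Type: comparable to a table in an RDBMS.
Document: comparable to a row in an RDBMS.

4: Create an index (indexing the first document creates the index automatically)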

PUT company/employee/1
{
"firstname":"John",
"lastname":"smith",
"age":25,
"about":"I love to rock climing",
"interests":["sports","music"]
}

5: Retrieve a document

GET company/employee/1

6: Simple search (query string)

GET company/employee/_search/?q=firstname:john

7: Query DSL search

GET /company/employee/_search
{
"query": {
"match": {
"lastname": "smith"
}
},
"from": 1,
"size": 1
}

8: Full-text search

GET /company/employee/_search
{
"query": {
"match": {
"about": "rock"
}
}
}

9: Phrase search (match_phrase matches the terms as a phrase, in order)

GET /company/employee/_search
{
"query": {
"match_phrase": {
"firstname": "John2"
}
}
}

10: Highlighted search

GET /company/employee/_search
{
"query": {
"match_phrase": {
"about": "rock climing"
}
},
"highlight": {
"fields": {
"about": {}
}
}
}

11: Aggregations (nested, multi-level rollups are supported)

GET /company/employee/_search
{
"aggs": {
"hello_aggs": {
"terms": {
"field": "age",
"size": 10
}
}
}
}

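12: Cluster health

GET _cluster/health

green: all primary shards and replica shards are available
yellow: all primary shards are available, but some replica shards are not
red: some primary shards are unavailable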
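
The three states can be read as a small decision rule. The sketch below is only an illustration in Python: the function name cluster_status and the (is_primary, is_assigned) pair representation are assumptions made for this example, not part of the Elasticsearch API.

```python
def cluster_status(shards):
    """Derive green/yellow/red from (is_primary, is_assigned) pairs."""
    if any(primary and not assigned for primary, assigned in shards):
        return "red"      # at least one primary shard is unavailable
    if any(not primary and not assigned for primary, assigned in shards):
        return "yellow"   # all primaries are up, but some replica is not
    return "green"        # every primary and replica is available


# One primary with one replica that could not be assigned.
print(cluster_status([(True, True), (False, False)]))
```

On a single-node setup like the one in item 2, an index that has replicas typically reports yellow, because a replica is never allocated to the same node as its primary.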

13: Document metadata

_index: index name
_type: document type
_id: document ID

14: List all indices

GET _cat/indices?v

15: Delete an index

DELETE company

16: Index a document

POST website/blog
{
"title":"hello world",
"content":"this is content",
"count":33
}
Note: use PUT when you supply the ID yourself; to let Elasticsearch generate the ID, you must use POST.

17: Retrieve a document (the ID is required; without it Elasticsearch cannot tell which document you mean)

GET /hello/nn/1

18: Retrieve part of a document

GET /test/hello/1/_source

GET /test/hello/1?_source=title

19: Check whether a document exists

Input:

curl -I http://localhost:9200/test/hello/1

HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 117

Input:

curl -I http://localhost:9200/test/hello/2

HTTP/1.1 404 Not Found
content-type: application/json; charset=UTF-8
content-length: 57

20: Update a whole document

PUT /test/hello/1
{
"title":11,
"age":2
}

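Note: this request replaces the previous document entirely, so fields can be added or removed. Any field that is reused, however, must still be compatible with the existing mapping of the type.

21: Create a new document

Method 1: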

POST /test/hello/
{
"nnn":1
}

Method 2:

PUT /test/hello/1/_create
{
"nnn":1
}

If the document already exists, the create request fails with 409 Conflict.

Method 3:

PUT /test/hello/3/?op_type=create
{
"nnn":1
}

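If the document already exists, the create request fails with 409 Conflict.

22: Handling conflicts

Pessimistic concurrency control (locking)
Optimistic concurrency control (version numbers: if the version sent with the request does not match the version currently stored on the node, the request fails, and the client can re-read the document and retry)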
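
The version check and the retry loop can be sketched in a few lines of Python. This is a toy in-memory model, not Elasticsearch client code: VersionedStore, VersionConflict, and update_with_retry are hypothetical names invented for the illustration, and the retry count plays the same role that retry_on_conflict plays in item 25.

```python
class VersionConflict(Exception):
    """Raised when the caller's version is not the stored version."""


class VersionedStore:
    """Toy in-memory store that mimics version-checked writes."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, source, expected_version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if someone else changed the document in between.
        if expected_version is not None and expected_version != current:
            raise VersionConflict(f"expected {expected_version}, found {current}")
        self._docs[doc_id] = (current + 1, dict(source))
        return current + 1


def update_with_retry(store, doc_id, change, retries=2):
    """Re-read the document and try again when another writer got in first."""
    for _ in range(retries + 1):
        version, source = store.get(doc_id)
        try:
            return store.put(doc_id, change(source), expected_version=version)
        except VersionConflict:
            continue
    raise VersionConflict("gave up after retries")


store = VersionedStore()
store.put("1", {"count": 0})
update_with_retry(store, "1", lambda doc: {**doc, "count": doc["count"] + 1})
print(store.get("1"))
```

No lock is ever held: a writer only finds out at write time that its copy was stale, which is cheap when conflicts are rare.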

23: Partial document update

POST /test/hello/3/_update
{
"doc": {
"hhh":1
}
}

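Note: the partial doc is merged into the existing document; fields that already exist are overwritten and new fields are added.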
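
As a rough model of that merge, here is a minimal Python sketch. merge_doc is a hypothetical helper written for this note; it assumes that nested objects are merged field by field while scalars and arrays are replaced outright.

```python
def merge_doc(existing, partial):
    """Recursively merge `partial` into a copy of `existing`."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_doc(merged[key], value)  # objects merge field by field
        else:
            merged[key] = value  # scalars and arrays are replaced outright
    return merged


# The stored document keeps "nnn" and gains "hhh".
print(merge_doc({"nnn": 1}, {"hhh": 1}))
```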

24: Partial update with scripts (omitted)

25: Updates and conflicts

POST /test/hello/3/_update?retry_on_conflict=2
{
"doc": {
"title":"111"
}
}

25: Retrieve multiple documents

Method 1:

GET /test/_mget
{
"docs":[
{
"_type":"hello",
"_id":1
},
{
"_type":"hello",
"_id":2
}
]
}

Method 2:

GET /test/hello/_mget
{
"ids":[1,2,3]
}

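26: Bulk operations

{ action: { metadata }}\n
{ request body }\n

Action types:

index: create the document, or replace it if it already exists
create: create the document only if it does not already exist
update: partially update the document
delete: delete the document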

POST /test/hello/_bulk
{"create":{"_id":4}}
{"hello":"fourth document"}
{"index":{"_id":3}}
{"hello":"third document"}
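
The line-oriented format is easy to get wrong by hand, so it can help to generate it. The Python sketch below only builds the request body as a string; build_bulk_body is a hypothetical helper name, and sending the body to the _bulk endpoint is left out.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, metadata, source) triples into a _bulk request body."""
    lines = []
    for action, metadata, source in actions:
        lines.append(json.dumps({action: metadata}))
        # delete has no request-body line; every other action is followed by one
        if action != "delete":
            lines.append(json.dumps(source))
    # every line, including the last one, must end with a newline
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("create", {"_id": 4}, {"hello": "fourth document"}),
    ("index", {"_id": 3}, {"hello": "third document"}),
])
print(body, end="")
```

Each JSON object has to stay on a single line, which is why the body is assembled with json.dumps per line rather than by pretty-printing one large structure.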

27: Routing a document to a shard

shard = hash(routing) % number_of_primary_shards
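
A minimal Python sketch of the formula follows. The hash here is an assumption: zlib.crc32 stands in for the hash function Elasticsearch uses internally, so the shard numbers it prints will not match a real cluster. pick_shard is a hypothetical name.

```python
import zlib


def pick_shard(routing, number_of_primary_shards):
    """Map a routing value (the document _id by default) to a shard number."""
    # zlib.crc32 is a stand-in; Elasticsearch uses its own hash function
    hashed = zlib.crc32(str(routing).encode("utf-8"))
    return hashed % number_of_primary_shards


for doc_id in ("1", "2", "3"):
    print(doc_id, pick_shard(doc_id, 5), pick_shard(doc_id, 6))
```

Because the result depends on number_of_primary_shards, changing that number would send existing IDs to different shards. That is why the number of primary shards is fixed when the index is created.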

28: How documents are created, indexed, and deleted (omitted; nothing complicated)

29: Search

GET /test/hello/_search?from=1&size=3

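Note: avoid deep paging. Every shard has to build a queue of from + size hits, and the coordinating node has to sort the hits returned by all shards, so performance degrades as from grows.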
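
The arithmetic behind that warning, as a small Python sketch. paging_cost is a hypothetical helper, and the five-shard index is an assumption chosen for the example.

```python
def paging_cost(from_, size, number_of_shards):
    """Count how many hits are held and sorted to serve one page."""
    per_shard = from_ + size                       # each shard returns its own top from+size
    at_coordinator = per_shard * number_of_shards  # all of them are merged and sorted
    discarded = at_coordinator - size              # only `size` hits reach the client
    return per_shard, at_coordinator, discarded


print(paging_cost(0, 10, 5))       # first page
print(paging_cost(10000, 10, 5))   # a page 10,000 hits deep
```

With five shards, the first page sorts 50 hits to return 10, while the deep page sorts 50,050 hits and throws away 50,040 of them.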

30: Simple search (query string)

GET /test/hello/_search?q=hhh:111

31: Exact values versus full text (concepts; omitted)

32: Analysis and analyzers

Character filters: char_filter
Tokenizer: tokenizer
Token filters: filter
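
The three stages run in that order. The Python sketch below imitates the pipeline with plain functions; it is not how Elasticsearch implements analysis, and the function names, the regular expressions, and the tiny stopword list are all assumptions made for the illustration.

```python
import re


def strip_html(text):
    """Character filter: runs on the raw string before tokenizing."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Tokenizer: splits the string into individual terms."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: normalizes each token."""
    return [t.lower() for t in tokens]


def remove_stopwords(tokens, stopwords=frozenset({"the", "a", "to"})):
    """Token filter: drops tokens that carry little meaning."""
    return [t for t in tokens if t not in stopwords]


def analyze(text):
    """char_filter -> tokenizer -> filter, in that order."""
    tokens = tokenize(strip_html(text))
    return remove_stopwords(lowercase(tokens))


print(analyze("<p>I love to Rock climb</p>"))
```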

33: View the mapping

GET /test/_mapping/hello

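34: The string type (settings from the book; Elasticsearch 5.x and later split string into text and keyword)

index: analyzed (indexed as full text)
index: not_analyzed (indexed as an exact value)
index: no (not indexed)

In 6.x the equivalents are type text (full text), type keyword (exact value), and index: false (not indexed).

35: Create an index (with a mapping)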

PUT /test
{
"mappings":{
"nihao":{
"properties":{
"title":{
"type":"text"
}
}
}
}
}

36: How inner objects are indexed

37: Structured queries (query DSL)

GET /test/hello/_search
{
"query": {
"match_all": {

}
}
}

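38: Queries versus filters (a query has to compute relevance; a filter only answers yes or no)

39: The most important queries and filters

term filter
terms filter
range filter
exists filter
missing filter (removed in 5.0; use exists inside a bool must_not instead)
bool filter
match_all query
match query
multi_match query
bool query

40: Combining queries with filters (p. 102 of the book; the example did not seem to work when tested)

41: Sorting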

GET /test/hello/_search
{
"sort": [
{
"age": {
"order": "desc"
}
}
]
}

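42: Introduction to relevance

Term frequency: how often the term appears in the field; the more often, the more relevant
Inverse document frequency: how often the term appears across the index; the more often, the less relevant
Field-length norm: the longer the field, the less relevant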
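
The three factors can be written down as formulas. The Python sketch below uses the classic TF/IDF definitions described in the book; treat it as an illustration only, since Elasticsearch 6.x scores with BM25 by default and a real score includes further factors. The function names are hypothetical.

```python
import math


def tf(term_freq):
    """Term frequency: more occurrences in the field, higher weight."""
    return math.sqrt(term_freq)


def idf(doc_count, doc_freq):
    """Inverse document frequency: terms common across the index weigh less."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


def field_norm(num_terms):
    """Field-length norm: longer fields weigh less."""
    return 1.0 / math.sqrt(num_terms)


def term_weight(term_freq, doc_count, doc_freq, num_terms):
    """Combine the three factors for one term in one field."""
    return tf(term_freq) * idf(doc_count, doc_freq) * field_norm(num_terms)


# The same term scores higher in a short field than in a long one.
print(term_weight(1, 1000, 10, 4), term_weight(1, 1000, 10, 100))
```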

43: How a distributed search executes

Query phase
Fetch phase
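
A toy simulation of the two phases in Python. Shards are modeled as plain dictionaries with precomputed scores, which is an assumption made to keep the example small; query_phase and fetch_phase are hypothetical names, not Elasticsearch APIs.

```python
import heapq


def query_phase(shards, from_, size):
    """Each shard returns only (score, shard_no, doc_id) for its top from+size hits."""
    candidates = []
    for shard_no, docs in enumerate(shards):
        top = heapq.nlargest(from_ + size, docs.items(), key=lambda kv: kv[1]["score"])
        candidates += [(doc["score"], shard_no, doc_id) for doc_id, doc in top]
    # the coordinating node merges the lists and keeps the requested window
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[from_:from_ + size]


def fetch_phase(shards, winners):
    """Only the winning documents are fetched in full from their shards."""
    return [shards[shard_no][doc_id]["source"] for _, shard_no, doc_id in winners]


shards = [
    {"a": {"score": 0.9, "source": {"title": "A"}},
     "b": {"score": 0.2, "source": {"title": "B"}}},
    {"c": {"score": 0.5, "source": {"title": "C"}}},
]
winners = query_phase(shards, 0, 2)
print(fetch_phase(shards, winners))
```

Splitting the work this way means full documents cross the network only for the hits that actually appear on the requested page.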

44: Search options

preference
timeout
routing
search_type

45: Scan and scroll (omitted)

46: Creating a custom analyzer

Book: Elasticsearch: The Definitive Guide

Tools:

elasticsearch: elasticsearch-6.4.0

kibana: kibana-6.4.0-darwin-x86_64
