Advanced Microservices Integration: ElasticSearch
Published: 2019-04-28


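The ElasticSearch search and analytics engine

Three basic ElasticSearch concepts

1. index

An ES index is not the same thing as a MySQL index. Creating an index in ES is comparable to creating a database in MySQL. The word is used both as a verb (to index a document, comparable to a MySQL insert) and as a noun (an index, comparable to a MySQL database).

2. type

A type in ES is comparable to a table inside a MySQL database.

3. document

A document in ES is comparable to a row of data in MySQL, except that documents are JSON objects and are not bound to a fixed set of columns the way MySQL rows are.

The correspondence between MySQL and ES:

MySQL     | ElasticSearch
Database  | Index
Table     | Type
Row       | Document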

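Installing ElasticSearch with docker

1. Switch to root (su root) or prefix each command with sudo, then pull the images

    (sudo) docker pull elasticsearch:7.4.2
    (sudo) docker pull kibana:7.4.2

2. Check the available memory with free -m

3. Create the directories that will be mounted into the container

    # mkdir -p creates missing parent directories as well
    mkdir -p /mydata/elasticsearch/config
    mkdir -p /mydata/elasticsearch/data

4. Create the configuration file and write the setting into it

    # allow ES to be reached from any remote machine
    echo "http.host: 0.0.0.0" > /mydata/elasticsearch/config/elasticsearch.yml

5. Start the container with the volumes mounted

    # -e "discovery.type=single-node" runs ES in single-node mode
    # -e ES_JAVA_OPTS sets the JVM heap (64 MB initial, 512 MB maximum)
    docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
      -e "discovery.type=single-node" \
      -e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
      -v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
      -v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
      -v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
      -d elasticsearch:7.4.2

    # start ES automatically whenever docker starts
    docker update elasticsearch --restart=always

6. Grant ES access to the mounted directories

    # change permissions recursively; the ES process in the container needs to read and write here
    chmod -R 777 /mydata/elasticsearch/

If the container exited earlier because it could not write to these directories, start it again with docker start elasticsearch.

7. Open the HTTP port to check that ES is up

http://192.168.56.10:9200

8. Install and start kibana

    # ELASTICSEARCH_HOSTS points Kibana at the ES HTTP port 9200
    # 5601 is the port of the Kibana home page
    docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.56.10:9200 -p 5601:5601 -d kibana:7.4.2

9. Open the kibana home page

http://192.168.56.10:5601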

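ElasticSearch basics

1. The _cat API

    http://192.168.56.10:9200/_cat/nodes      # list all nodes
    http://192.168.56.10:9200/_cat/health     # show the health of ES
    http://192.168.56.10:9200/_cat/master     # show the master node
    http://192.168.56.10:9200/_cat/indices    # list all indices (the "databases")

2. Creating documents with PUT/POST

    POST http://192.168.56.10:9200/customer/external/1
    {
      "name": "test"
    }

    PUT http://192.168.56.10:9200/customer/external/1
    {
      "name": "test"
    }

Both requests have the same effect here: they store the JSON body as the document with id 1 in the type external of the index customer. POST and PUT still differ:

POST                                               | PUT
id is optional; one is generated when it is omitted | id is required
with an id: insert if absent, otherwise overwrite   | insert if absent, otherwise overwrite
normally used to create                             | used to create or to modify
without an id a new document is created each time; with an id the version increases on every write | the version increases on every write

3. Reading a document with GET

    GET http://<ElasticSearch host and port>/<index>/<type>/<id>
    http://192.168.56.10:9200/customer/external/1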


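4. Updating with PUT

A write can be made conditional on the state the client last read (optimistic concurrency control). _seq_no is the concurrency-control field and increases with every write; _primary_term changes when the primary shard is reassigned, for example after a restart. In ES 7.x the condition is passed with the query parameters if_seq_no and if_primary_term.

If the supplied values do not match the document's current ones, the update is rejected with a version conflict error.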
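The conditional update described above can be pictured with a toy in-memory store. This is only a sketch of the idea, not how Elasticsearch is implemented; the class SeqNoStore and its methods are hypothetical names, and the single counter stands in for the per-shard sequence number.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of if_seq_no style optimistic locking (hypothetical, illustration only). */
final class SeqNoStore {
    private final Map<String, String> docs = new HashMap<>();
    private final Map<String, Long> seqNos = new HashMap<>();
    private long nextSeqNo = 0;

    /** Unconditional write: always succeeds and returns the sequence number assigned to it. */
    long put(String id, String body) {
        docs.put(id, body);
        long seqNo = nextSeqNo++;
        seqNos.put(id, seqNo);
        return seqNo;
    }

    /** Conditional write: succeeds only if the caller has seen the document's latest sequence number. */
    long putIfSeqNo(String id, String body, long expectedSeqNo) {
        Long current = seqNos.get(id);
        if (current == null || current != expectedSeqNo) {
            throw new IllegalStateException(
                    "version conflict: expected seq_no " + expectedSeqNo + " but found " + current);
        }
        return put(id, body);
    }

    String get(String id) {
        return docs.get(id);
    }
}
```

A second writer that still holds the old sequence number gets the exception instead of silently overwriting the first writer's change, which is the behaviour the conflict error above reports.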

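5. Updating a document with POST and _update

    # with _update the body must be wrapped in "doc", otherwise the request fails;
    # if the new content equals the stored content, nothing is written and the version does not change
    POST customer/external/1/_update
    {
      "doc": {
        "name": "111"
      }
    }

    # without _update the body is stored as it is and the version always increases;
    # a "doc" wrapper here would be saved as an ordinary field named doc, so it is left out
    POST customer/external/1
    {
      "name": "222"
    }

    # PUT behaves the same way: the body replaces the document and the version always increases
    PUT customer/external/1
    {
      "name": "222"
    }

6. Deleting

    # ElasticSearch can delete an index or a document, but not a type
    DELETE customer/external/1
    DELETE customer

7. Batch operations with _bulk

Syntax

    # One JSON object per line: an action line, followed by a source line for create, index and update.
    # Run this in Kibana; Postman may not keep the line breaks the body depends on.
    # POST /_bulk addresses the whole ES instance, so every action names its own index.
    POST /_bulk
    {"delete":{"_index":"website","_type":"blog","_id":"123"}}
    {"create":{"_index":"website","_type":"blog","_id":"123"}}
    {"title":"my first blog post"}
    {"index":{"_index":"website","_type":"blog"}}
    {"title":"my second blog post"}
    {"update":{"_index":"website","_type":"blog","_id":"123"}}
    {"doc":{"title":"my updated blog post"}}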

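Because a _bulk body is newline-delimited JSON, it helps to assemble it line by line when it is built from code. The following is a minimal sketch with a hypothetical helper class BulkBody; it only produces the request body as a string and assumes the caller passes already serialized JSON for each line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles a newline-delimited _bulk request body. */
final class BulkBody {
    private final List<String> lines = new ArrayList<>();

    /** Adds an action line followed by its source line (create, index, update). */
    BulkBody add(String actionJson, String sourceJson) {
        lines.add(actionJson);
        lines.add(sourceJson);
        return this;
    }

    /** Adds an action that carries no source line (delete). */
    BulkBody add(String actionJson) {
        lines.add(actionJson);
        return this;
    }

    /** Joins the lines with '\n' and appends the trailing newline the endpoint expects. */
    String build() {
        return String.join("\n", lines) + "\n";
    }
}
```

Each JSON object has to stay on a single line; pretty-printing an action or a source document across several lines would break the format.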
8. Test data

Advanced ES

1.

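2. Searching with the Query DSL

    # However many documents match, only 10 hits are returned by default.
    # The # annotations are explanations; remove them if the console rejects them.
    GET /bank/_search
    {
      "query": {
        "match_all": {}          # the query clause; match_all matches every document
      },
      "from": 0,                 # offset of the first hit
      "size": 5,                 # number of hits to return
      "_source": ["balance"],    # fields to return
      "sort": [                  # sorting
        {
          "account_number": {    # field to sort by
            "order": "desc"      # sort direction
          }
        }
      ]
    }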

3. query/match

  1. When the field is not a string, match performs an exact comparison, like where xxx = xxx in MySQL.

    GET bank/_search
    {
      "query": {
        "match": {
          "account_number": "20"    # field name: value
        }
      }
    }

    # response
    {
      "took" : 14, "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 1, "relation" : "eq" },
        "max_score" : 1.0,
        "hits" : [
          { "_index" : "bank", "_type" : "account", "_id" : "20", "_score" : 1.0,
            "_source" : { "account_number" : 20, "balance" : 16418, "firstname" : "Elinor", "lastname" : "Ratliff", "age" : 36, "gender" : "M", "address" : "282 Kings Place", "employer" : "Scentric", "email" : "elinorratliff@scentric.com", "city" : "Ribera", "state" : "WA" } }
        ]
      }
    }

  2. When the field is a string, match performs a full-text search: the query string is split into terms, and documents containing any of the terms are returned, ranked by score. It is loosely comparable to where xxx like '%xxx%' in MySQL, except that matching works on whole terms rather than on substrings.

    GET bank/_search
    {
      "query": {
        "match": {
          "address": "kings"
        }
      }
    }

    # response
    {
      "took" : 23, "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 2, "relation" : "eq" },
        "max_score" : 5.9908285,
        "hits" : [
          { "_index" : "bank", "_type" : "account", "_id" : "20", "_score" : 5.9908285,
            "_source" : { "account_number" : 20, "balance" : 16418, "firstname" : "Elinor", "lastname" : "Ratliff", "age" : 36, "gender" : "M", "address" : "282 Kings Place", "employer" : "Scentric", "email" : "elinorratliff@scentric.com", "city" : "Ribera", "state" : "WA" } },
          { "_index" : "bank", "_type" : "account", "_id" : "722", "_score" : 5.9908285,
            "_source" : { "account_number" : 722, "balance" : 27256, "firstname" : "Roberts", "lastname" : "Beasley", "age" : 34, "gender" : "F", "address" : "305 Kings Hwy", "employer" : "Quintity", "email" : "robertsbeasley@quintity.com", "city" : "Hayden", "state" : "PA" } }
        ]
      }
    }

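4. query/match_phrase [phrase matching]

  1. The query string is matched as one phrase: its terms must all appear in the field, next to each other and in the same order.

    GET bank/_search
    {
      "query": {
        "match_phrase": {
          "address": "mill road"    # "mill road" is treated as a single search phrase, not as mill OR road;
                                    # comparable to where xxx like '%mill road%' in MySQL
        }
      }
    }

  2. Alternatively, query the field's .keyword sub-field for a match on the whole value. This comparison is case-sensitive and corresponds to where xxx = xxx in MySQL.

    GET bank/_search
    {
      "query": {
        "match": {
          "address.keyword": "mill road"
        }
      }
    }

query/match on field           | query/match_phrase on field             | query/match on field.keyword
query string split into terms  | matched as one phrase                   | not split
fuzzy: any term may match      | fuzzy: phrase may appear anywhere       | exact: the whole value must be equal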
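The three columns of the table can be compared with a small stand-alone model. This is a sketch under simplifying assumptions: tokens() lowercases and splits on whitespace as a crude stand-in for the real analyzer, and the class MatchModes with its methods is hypothetical, not part of any ES client.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Toy model of match, match_phrase and .keyword matching (illustration only). */
final class MatchModes {
    /** Crude stand-in for an analyzer: lowercase, then split on whitespace. */
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** match: at least one query term occurs among the field's terms. */
    static boolean match(String fieldValue, String query) {
        List<String> fieldTokens = tokens(fieldValue);
        return tokens(query).stream().anyMatch(fieldTokens::contains);
    }

    /** match_phrase: the query terms occur consecutively and in order. */
    static boolean matchPhrase(String fieldValue, String query) {
        return Collections.indexOfSubList(tokens(fieldValue), tokens(query)) >= 0;
    }

    /** field.keyword: the whole stored value must be equal, case-sensitively. */
    static boolean keyword(String fieldValue, String query) {
        return fieldValue.equals(query);
    }
}
```

With the address "990 Mill Road", the query "mill lane" satisfies match (through mill), "mill road" satisfies matchPhrase, and only the exact string "990 Mill Road" satisfies keyword.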

5. query/multi_match [matching several fields]

  1. Runs one query string against several fields at once, comparable to select * from xxx where a like xxx or b like xxx in MySQL. The query string is split into terms.

    GET bank/_search
    {
      "query": {
        "multi_match": {             # the match queries above named a single field only
          "query": "mill road",      # the search keywords; they are split into terms
          "fields": [                # search in state and address; a document matches if either
            "state",                 # field contains one of the terms, it need not be in both
            "address"
          ]
        }
      }
    }

    # response
    {
      "took" : 4, "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 32, "relation" : "eq" },
        "max_score" : 8.926605,
        "hits" : [
          { "_index" : "bank", "_type" : "account", "_id" : "970", "_score" : 8.926605,
            "_source" : { "account_number" : 970, "balance" : 19648, "firstname" : "Forbes", "lastname" : "Wallace", "age" : 28, "gender" : "M", "address" : "990 Mill Road", "employer" : "Pheast", "email" : "forbeswallace@pheast.com", "city" : "Lopezo", "state" : "AK" } },
          { "_index" : "bank", "_type" : "account", "_id" : "136", "_score" : 5.4032025,
            "_source" : { "account_number" : 136, "balance" : 45801, "firstname" : "Winnie", "lastname" : "Holland", "age" : 38, "gender" : "M", "address" : "198 Mill Lane", "employer" : "Neteria", "email" : "winnieholland@neteria.com", "city" : "Urie", "state" : "IL" } },
          { "_index" : "bank", "_type" : "account", "_id" : "345", "_score" : 5.4032025,
            "_source" : { "account_number" : 345, "balance" : 9812, "firstname" : "Parker", "lastname" : "Hines", "age" : 38, "gender" : "M", "address" : "715 Mill Avenue", "employer" : "Baluba", "email" : "parkerhines@baluba.com", "city" : "Blackgum", "state" : "KY" } },
          { "_index" : "bank", "_type" : "account", "_id" : "472", "_score" : 5.4032025,
            "_source" : { "account_number" : 472, "balance" : 25571, "firstname" : "Lee", "lastname" : "Long", "age" : 32, "gender" : "F", "address" : "288 Mill Street", "employer" : "Comverges", "email" : "leelong@comverges.com", "city" : "Movico", "state" : "MT" } },
          { "_index" : "bank", "_type" : "account", "_id" : "431", "_score" : 3.5234027,
            "_source" : { "account_number" : 431, "balance" : 13136, "firstname" : "Laurie", "lastname" : "Shaw", "age" : 26, "gender" : "F", "address" : "263 Aviation Road", "employer" : "Zillanet", "email" : "laurieshaw@zillanet.com", "city" : "Harmon", "state" : "WV" } },
          { "_index" : "bank", "_type" : "account", "_id" : "436", "_score" : 3.5234027,
            "_source" : { "account_number" : 436, "balance" : 27585, "firstname" : "Alexander", "lastname" : "Sargent", "age" : 23, "gender" : "M", "address" : "363 Albemarle Road", "employer" : "Fangold", "email" : "alexandersargent@fangold.com", "city" : "Calpine", "state" : "OR" } },
          { "_index" : "bank", "_type" : "account", "_id" : "532", "_score" : 3.5234027,
            "_source" : { "account_number" : 532, "balance" : 17207, "firstname" : "Hardin", "lastname" : "Kirk", "age" : 26, "gender" : "M", "address" : "268 Canarsie Road", "employer" : "Exposa", "email" : "hardinkirk@exposa.com", "city" : "Stouchsburg", "state" : "IL" } },
          { "_index" : "bank", "_type" : "account", "_id" : "873", "_score" : 3.5234027,
            "_source" : { "account_number" : 873, "balance" : 43931, "firstname" : "Tisha", "lastname" : "Cotton", "age" : 39, "gender" : "F", "address" : "432 Lincoln Road", "employer" : "Buzzmaker", "email" : "tishacotton@buzzmaker.com", "city" : "Bluetown", "state" : "GA" } },
          { "_index" : "bank", "_type" : "account", "_id" : "83", "_score" : 3.5234027,
            "_source" : { "account_number" : 83, "balance" : 35928, "firstname" : "Mayo", "lastname" : "Cleveland", "age" : 28, "gender" : "M", "address" : "720 Brooklyn Road", "employer" : "Indexia", "email" : "mayocleveland@indexia.com", "city" : "Roberts", "state" : "ND" } },
          { "_index" : "bank", "_type" : "account", "_id" : "88", "_score" : 3.5234027,
            "_source" : { "account_number" : 88, "balance" : 26418, "firstname" : "Adela", "lastname" : "Tyler", "age" : 21, "gender" : "F", "address" : "737 Clove Road", "employer" : "Surelogic", "email" : "adelatyler@surelogic.com", "city" : "Boling", "state" : "SD" } }
        ]
      }
    }

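6. query/bool compound queries

  1. must: every condition listed under must has to match.
  2. must_not: none of the conditions listed under must_not may match.
  3. should: the conditions listed under should ought to match. A document that does not satisfy them is still returned, but one that does gets a higher score.

    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [                                    # address must match mill and gender must match M
            { "match": { "address": "mill" } },
            { "match": { "gender": "M" } }
          ],
          "must_not": [                                # age must not be 38
            { "match": { "age": "38" } }
          ],
          "should": [                                  # does not change which documents match,
            { "match": { "lastname": "Wallace" } }     # but raises the score of those that satisfy it
          ]
        }
      }
    }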

7. query/filter [filtering results]

  1. filter narrows down the documents found by the query.

    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "address": "mill" } }    # the query string is split into terms
          ],
          "filter": {
            "range": {                            # range filter
              "balance": {                        # field to filter on
                "gte": "10000",
                "lte": "20000"
              }
            }
          }
        }
      }
    }

    # response
    {
      "took" : 0, "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 1, "relation" : "eq" },
        "max_score" : 5.4032025,
        "hits" : [
          { "_index" : "bank", "_type" : "account", "_id" : "970", "_score" : 5.4032025,
            "_source" : { "account_number" : 970, "balance" : 19648, "firstname" : "Forbes", "lastname" : "Wallace", "age" : 28, "gender" : "M", "address" : "990 Mill Road", "employer" : "Pheast", "email" : "forbeswallace@pheast.com", "city" : "Lopezo", "state" : "AK" } }
        ]
      }
    }

  2. filter resembles must_not in one respect: both only include or exclude documents, and neither of them affects the score.

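8. query/term for non-text fields

  1. Like query/match, term looks up documents by the value of a field. The difference is that term does not split or normalize the query string, so it is mainly meant for non-text fields. Against a text field it usually finds nothing, because the stored value was split into lowercase terms when it was indexed.

    GET bank/_search
    {
      "query": {
        "term": {
          "address": "mill Road"
        }
      }
    }

    # response: the same search that finds documents with query/match returns 0 hits with query/term
    {
      "took" : 0, "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 0, "relation" : "eq" },
        "max_score" : null,
        "hits" : [ ]
      }
    }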

9. aggs aggregations

  1. Find the accounts whose address contains Mill, then aggregate over them: the distribution of ages, the average age and the average balance.

    GET bank/_search
    {
      "query": {                       # find the documents that contain mill
        "match": { "address": "Mill" }
      },
      "aggs": {                        # aggregations
        "ageAgg": {                    # a name of your choice
          "terms": {                   # terms aggregation: distribution of the values
            "field": "age",
            "size": 10
          }
        },
        "ageAvg": {                    # a name of your choice
          "avg": {                     # avg aggregation: the mean value
            "field": "age"             # field to aggregate
          }
        },
        "balanceAvg": {                # a name of your choice
          "avg": {                     # the mean of balance
            "field": "balance"         # field to aggregate
          }
        }
      },
      "size": 0                        # do not return the hits themselves
    }

    # response
    {
      "took" : 14, "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 4, "relation" : "eq" },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {               # aggregation results
        "ageAgg" : {                   # result of the aggregation named ageAgg
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            { "key" : 38, "doc_count" : 2 },    # two people are 38 years old
            { "key" : 28, "doc_count" : 1 },
            { "key" : 32, "doc_count" : 1 }
          ]
        },
        "ageAvg" : { "value" : 34.0 },          # the average age of the matching people is 34
        "balanceAvg" : { "value" : 25208.0 }    # their average balance
      }
    }

10. aggs sub-aggregations

  1. Aggregate by age and compute the average balance of the people in each age bucket.

    GET bank/_search
    {
      "query": { "match_all": {} },
      "aggs": {
        "ageAgg": {
          "terms": {                   # distribution: number of people per age
            "field": "age",
            "size": 100
          },
          "aggs": {                    # sits next to terms: aggregates again inside each bucket of the
            "ageAvg": {                # parent aggregation, here the average balance per age
              "avg": { "field": "balance" }
            }
          }
        }
      },
      "size": 0                        # do not return the hits themselves
    }

    # response
    "aggregations" : {
      "ageAgg" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          { "key" : 31, "doc_count" : 61, "ageAvg" : { "value" : 28312.918032786885 } },
          { "key" : 39, "doc_count" : 60, "ageAvg" : { "value" : 25269.583333333332 } },
          { "key" : 26, "doc_count" : 59, "ageAvg" : { "value" : 23194.813559322032 } },
          { "key" : 32, "doc_count" : 52, "ageAvg" : { "value" : 23951.346153846152 } },
          { "key" : 35, "doc_count" : 52, "ageAvg" : { "value" : 22136.69230769231 } },
          { "key" : 36, "doc_count" : 52, "ageAvg" : { "value" : 22174.71153846154 } },
          { "key" : 22, "doc_count" : 51, "ageAvg" : { "value" : 24731.07843137255 } },
          { "key" : 28, "doc_count" : 51, "ageAvg" : { "value" : 28273.882352941175 } },
          { "key" : 33, "doc_count" : 50, "ageAvg" : { "value" : 25093.94 } },
          { "key" : 34, "doc_count" : 49, "ageAvg" : { "value" : 26809.95918367347 } },
          { "key" : 30, "doc_count" : 47, "ageAvg" : { "value" : 22841.106382978724 } },
          { "key" : 21, "doc_count" : 46, "ageAvg" : { "value" : 26981.434782608696 } },
          { "key" : 40, "doc_count" : 45, "ageAvg" : { "value" : 27183.17777777778 } },
          { "key" : 20, "doc_count" : 44, "ageAvg" : { "value" : 27741.227272727272 } },
          { "key" : 23, "doc_count" : 42, "ageAvg" : { "value" : 27314.214285714286 } },
          { "key" : 24, "doc_count" : 42, "ageAvg" : { "value" : 28519.04761904762 } },
          { "key" : 25, "doc_count" : 42, "ageAvg" : { "value" : 27445.214285714286 } },
          { "key" : 37, "doc_count" : 42, "ageAvg" : { "value" : 27022.261904761905 } },
          { "key" : 27, "doc_count" : 39, "ageAvg" : { "value" : 21471.871794871793 } },
          { "key" : 38, "doc_count" : 39, "ageAvg" : { "value" : 26187.17948717949 } },
          { "key" : 29, "doc_count" : 35, "ageAvg" : { "value" : 29483.14285714286 } }
        ]
      }
    }
  2. Aggregate all documents by age; within each age bucket aggregate again by gender, and compute the average balance for each gender as well as for the age bucket as a whole.

    GET bank/_search
    {
      "query": { "match_all": {} },
      "aggs": {
        "ageAgg": {
          "terms": {                          # distribution of age
            "field": "age",
            "size": 100
          },
          "aggs": {                           # sub-aggregations
            "genderAgg": {
              "terms": {                      # distribution of gender
                "field": "gender.keyword"     # note: text fields must be aggregated through .keyword
              },
              "aggs": {                       # sub-aggregation
                "balanceAvg": {
                  "avg": { "field": "balance" }    # average balance per gender
                }
              }
            },
            "ageBalanceAvg": {
              "avg": { "field": "balance" }        # average balance of the whole age bucket (both genders)
            }
          }
        }
      },
      "size": 0
    }

    # response
    "aggregations" : {
      "ageAgg" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : 31,
            "doc_count" : 61,
            "genderAgg" : {
              "doc_count_error_upper_bound" : 0,
              "sum_other_doc_count" : 0,
              "buckets" : [
                { "key" : "M", "doc_count" : 35, "balanceAvg" : { "value" : 29565.628571428573 } },
                { "key" : "F", "doc_count" : 26, "balanceAvg" : { "value" : 26626.576923076922 } }
              ]
            },
            "ageBalanceAvg" : { "value" : 28312.918032786885 }
          }
        ]
      }
    }

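For readers who think in Java collections, a terms aggregation with an avg sub-aggregation behaves much like grouping a list and averaging each group. The sketch below is an analogy only and runs entirely in memory; AggSketch and Account are hypothetical names, and the four sample accounts are the ones that matched Mill in the results above (ids 970, 136, 345 and 472).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** In-memory analogy for terms and avg aggregations (illustration only). */
final class AggSketch {
    static final class Account {
        final int age;
        final String gender;
        final long balance;

        Account(int age, String gender, long balance) {
            this.age = age;
            this.gender = gender;
            this.balance = balance;
        }
    }

    /** The four accounts whose address contains Mill. */
    static List<Account> sample() {
        return Arrays.asList(
                new Account(28, "M", 19648),
                new Account(38, "M", 45801),
                new Account(38, "M", 9812),
                new Account(32, "F", 25571));
    }

    /** Like "terms" on age: one bucket per age with its doc_count. */
    static Map<Integer, Long> countByAge(List<Account> accounts) {
        return accounts.stream().collect(
                Collectors.groupingBy((Account a) -> a.age, TreeMap::new, Collectors.counting()));
    }

    /** Like "terms" on age with an "avg" sub-aggregation on balance. */
    static Map<Integer, Double> avgBalanceByAge(List<Account> accounts) {
        return accounts.stream().collect(
                Collectors.groupingBy((Account a) -> a.age, TreeMap::new,
                        Collectors.averagingLong((Account a) -> a.balance)));
    }

    /** Like a top-level "avg" on balance. */
    static double avgBalance(List<Account> accounts) {
        return accounts.stream().mapToLong(a -> a.balance).average().orElse(0);
    }
}
```

The counts per age and the overall average reproduce the ageAgg buckets and the balanceAvg value of 25208.0 shown in section 9.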
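Mapping: field mappings

1. Viewing a mapping

  1. GET [index]/_mapping

2. Field types in a mapping

  1. The mapping assigns a type to each field of an index

3. Creating an index (database) and configuring its mapping (field types)

    PUT /my_index
    {
      "mappings": {
        "properties": {
          "age":   { "type": "integer" },
          "email": { "type": "keyword" },    # stored as a keyword, not split into terms
          "name":  { "type": "text" }        # full-text field: split into terms when stored and matched by term when searched
        }
      }
    }

4. Adding a new field mapping to an existing index

    PUT /my_index/_mapping
    {
      "properties": {
        "employee-id": {
          "type": "keyword",
          "index": false    # the field cannot be searched
        }
      }
    }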

5. Changing a mapping

  1. An existing field mapping cannot be changed. To change it, create a new index with the desired mapping and migrate the data.
  2. Migrate the original test data to a new index and drop the type on the way (mapping types are deprecated in 7.x and removed in 8.0).

    # 1. create the new index without the account type
    PUT /newbank
    {
      "mappings": {
        "properties": {
          "account_number": { "type": "long" },
          "address":        { "type": "text" },
          "age":            { "type": "integer" },
          "balance":        { "type": "long" },
          "city":           { "type": "keyword" },
          "email":          { "type": "keyword" },
          "employer":       { "type": "keyword" },
          "firstname":      { "type": "text" },
          "gender":         { "type": "keyword" },
          "lastname": {
            "type": "text",
            "fields": {
              "keyword": { "type": "keyword", "ignore_above": 256 }
            }
          },
          "state":          { "type": "keyword" }
        }
      }
    }

    # 2. migrate the data from bank to newbank, dropping the type
    POST _reindex
    {
      "source": {
        "index": "bank",
        "type": "account"
      },
      "dest": {
        "index": "newbank"
      }
    }

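IK analyzer

1. Installing under docker

  1. Because the ES plugins directory was mounted onto the Linux VM earlier, the IK analyzer only has to be installed under /mydata/elasticsearch/plugins.
  2. (If the VM cannot reach the internet, i.e. external sites do not answer a ping, copy the package over with xshell.)
  3. (If the VM can reach the internet, install wget with yum install wget, download the package with wget <package URL>, then unpack it and delete the archive.)

2. Custom dictionary

  1. Install Nginx with its directories mounted on the host.

    docker run -p 80:80 --name nginx \
      -v /mydata/nginx/html:/usr/share/nginx/html \
      -v /mydata/nginx/logs:/var/log/nginx \
      -v /mydata/nginx/conf:/etc/nginx \
      -d nginx:1.10

  2. Create the dictionary file (es/fenci.txt under the nginx html directory).
  3. In /mydata/elasticsearch/plugins/ik/config/IKAnalyzer.cfg.xml, change the remote dictionary entry to the address under which nginx serves html/es/fenci.txt.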

Integrating ElasticSearch with SpringBoot

1. Test workflow

  1. Add the dependency

    <properties>
        <java.version>1.8</java.version>
        <!-- override the version managed by springboot -->
        <elasticsearch.version>7.4.2</elasticsearch.version>
    </properties>

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>${elasticsearch.version}</version>
    </dependency>

  2. Write the ES configuration class

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class EsConfig {

        // request options shared by all calls
        public static final RequestOptions COMMON_OPTIONS;

        static {
            RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
            COMMON_OPTIONS = builder.build();
        }

        @Bean
        public RestHighLevelClient esRestClient() {
            return new RestHighLevelClient(
                    RestClient.builder(new HttpHost("192.168.56.10", 9200, "http")));
        }
    }

  3. Unit test for indexing a document

    @RunWith(SpringRunner.class)
    @SpringBootTest(classes = MailSearchApplication.class)
    public class MailSearchApplicationTests {

        @Resource
        private RestHighLevelClient client;

        @Data
        class User {
            private String userName;
            private Integer age;
            private String gender;
        }

        @Test
        public void indexData() throws IOException {
            // ElasticSearch Java reference:
            // https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-index.html#java-rest-high-document-index
            // index request for the index ("database") named users
            IndexRequest indexRequest = new IndexRequest("users");
            indexRequest.id("2");

            User user = new User();
            user.setUserName("王五");
            user.setAge(20);
            user.setGender("男");
            String jsonString = JSON.toJSONString(user);

            // set the content to store and declare its content type
            indexRequest.source(jsonString, XContentType.JSON);

            // create the index if necessary and store the document
            IndexResponse index = client.index(indexRequest, EsConfig.COMMON_OPTIONS);
            System.out.println(index);
        }
    }

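Pitfall:

While running the unit test, the test kept failing with an error saying that RestHighLevelClient could not be found and the bean could not be injected. The cause turned out to be the classes attribute of @SpringBootTest: it has to name the springboot application class, not the test class itself. The attribute can also be left out, in which case the application class is used by default. This cost an hour and a half.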

  4. Unit test for searching

    /**
     * @throws IOException
     */
    @Test
    public void searchState() throws IOException {
        // 1. build the search request
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // sourceBuilder.query(QueryBuilders.termQuery("city", "Nicholson"));
        // sourceBuilder.from(0);
        // sourceBuilder.size(5);
        // sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        QueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("state", "AK");
        // .fuzziness(Fuzziness.AUTO)
        // .prefixLength(3)
        // .maxExpansions(10);
        sourceBuilder.query(matchQueryBuilder);

        SearchRequest searchRequest = new SearchRequest();
        searchRequest.indices("bank");
        searchRequest.source(sourceBuilder);

        // 2. run the search
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(searchResponse);
    }

  5. A more complex search

    /**
     * Complex search: in bank, find everyone whose address contains mill and compute
     * their age distribution, average age and average balance.
     * Equivalent to this query in Kibana:
     *
     * GET bank/_search
     * {
     *   "query": { "match": { "address": "Mill" } },
     *   "aggs": {
     *     "ageAgg":     { "terms": { "field": "age", "size": 10 } },
     *     "ageAvg":     { "avg":   { "field": "age" } },
     *     "balanceAvg": { "avg":   { "field": "balance" } }
     *   },
     *   "size": 0
     * }
     *
     * @throws IOException
     */
    @Test
    public void searchData() throws IOException {
        // 1. build the search request
        SearchRequest searchRequest = new SearchRequest();
        // 1.1) choose the index
        searchRequest.indices("bank");
        // 1.2) build the search conditions; comparable to the QueryWrapper of mybatisplus
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // equivalent to query/match in the DSL
        sourceBuilder.query(QueryBuilders.matchQuery("address", "Mill"));

        // 1.2.1) aggregate by age distribution
        TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10);
        sourceBuilder.aggregation(ageAgg);
        // 1.2.2) average age
        AvgAggregationBuilder ageAvg = AggregationBuilders.avg("ageAvg").field("age");
        sourceBuilder.aggregation(ageAvg);
        // 1.2.3) average balance
        AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
        sourceBuilder.aggregation(balanceAvg);

        System.out.println("search request: " + sourceBuilder);
        searchRequest.source(sourceBuilder);

        // 2. run the search
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println("search response: " + searchResponse);

        // 3. map the hits to beans
        SearchHits hits = searchResponse.getHits();
        SearchHit[] searchHits = hits.getHits();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            Account account = JSON.parseObject(sourceAsString, Account.class);
            System.out.println(account);
        }

        // 4. read the aggregation results
        Aggregations aggregations = searchResponse.getAggregations();
        Terms ageAgg1 = aggregations.get("ageAgg");
        for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
            String keyAsString = bucket.getKeyAsString();
            System.out.println("age: " + keyAsString + " ==> " + bucket.getDocCount());
        }
        Avg ageAvg1 = aggregations.get("ageAvg");
        System.out.println("average age: " + ageAvg1.getValue());
        Avg balanceAvg1 = aggregations.get("balanceAvg");
        System.out.println("average balance: " + balanceAvg1.getValue());
    }

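Notes

ElasticSearch has a field type called nested, for embedded objects. When the array to be stored contains objects, the field has to be mapped as nested; otherwise searches can return wrong results. For example:

    PUT test_nested/_doc/1
    {
      "group": "fans",
      "user": [
        { "first": "John",  "last": "Smith" },
        { "first": "Alice", "last": "White" }
      ]
    }

A query/bool/must search for user.first = Alice and user.last = Smith should find nothing, because no single user has that combination, yet the document is returned. The reason is that ElasticSearch flattens the array: the values of the same property are put together across all objects (user.first holds John and Alice, user.last holds Smith and White), so the link between a first name and its last name is lost. To avoid wrong hits, declare the field as nested when the mapping is created, before any document is indexed:

    PUT test_nested
    {
      "mappings": {
        "properties": {
          "user": { "type": "nested" }
        }
      }
    }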

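The flattening effect described above can be reproduced without ES. The following sketch models an object array once as merged value sets and once object by object; FlattenSketch and its methods are hypothetical names, and the lowercase comparison is a simplification of what an analyzer does.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of flattened versus nested matching on an array of objects. */
final class FlattenSketch {
    static Map<String, String> user(String first, String last) {
        Map<String, String> u = new LinkedHashMap<>();
        u.put("first", first);
        u.put("last", last);
        return u;
    }

    /** The two users of the example document. */
    static List<Map<String, String>> sampleUsers() {
        return Arrays.asList(user("John", "Smith"), user("Alice", "White"));
    }

    /** The query from the text: first = Alice and last = Smith. */
    static Map<String, String> sampleQuery() {
        Map<String, String> q = new LinkedHashMap<>();
        q.put("user.first", "Alice");
        q.put("user.last", "Smith");
        return q;
    }

    /** Merges the objects into "prefix.field" -> all values seen, losing which object they came from. */
    static Map<String, Set<String>> flatten(String prefix, List<Map<String, String>> objects) {
        Map<String, Set<String>> flat = new TreeMap<>();
        for (Map<String, String> obj : objects) {
            for (Map.Entry<String, String> e : obj.entrySet()) {
                flat.computeIfAbsent(prefix + "." + e.getKey(), k -> new TreeSet<>())
                        .add(e.getValue().toLowerCase(Locale.ROOT));
            }
        }
        return flat;
    }

    /** Flattened matching: every condition is checked against the merged value sets. */
    static boolean matchesFlattened(Map<String, Set<String>> flat, Map<String, String> conditions) {
        return conditions.entrySet().stream().allMatch(c ->
                flat.getOrDefault(c.getKey(), Collections.<String>emptySet())
                        .contains(c.getValue().toLowerCase(Locale.ROOT)));
    }

    /** Nested matching: all conditions must hold inside one and the same object. */
    static boolean matchesNested(String prefix, List<Map<String, String>> objects,
                                 Map<String, String> conditions) {
        return objects.stream().anyMatch(obj ->
                matchesFlattened(flatten(prefix, Collections.singletonList(obj)), conditions));
    }
}
```

With the sample data, the flattened check accepts the Alice/Smith query while the nested check rejects it, which is the difference the nested mapping makes.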
Implementing product search for the microservice mall with ES

1. The search DSL

  • Overall structure

    GET product/_search
    {
      # the main query
      "query": {},
      # sorting
      "sort": {},
      # paging: offset of the first hit
      "from": {},
      # paging: number of hits per page
      "size": {},
      # highlighting
      "highlight": {},
      # aggregations over the matched data, used to build the filter options offered to the user
      "aggs": {}
    }

  • A concrete query

    GET grapesmail_product/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "skuTitle": "第二次测试" } }        # skuTitle is matched against the user's search keyword
          ],
          # filter clauses do not take part in scoring, which improves performance
          "filter": [
            { "term":  { "catalogId": "225" } },
            { "terms": { "brandId": [ "10" ] } },            # exact match against several values
            { "term":  { "hasStock": "false" } },            # exact match against one value
            { "range": { "skuPrice": {} } },
            {
              # attributes are stored as nested objects and must be queried through a nested query
              "nested": {
                "path": "attrs",                             # path of the nested field
                "query": {
                  "bool": {
                    "must": [
                      { "term": { "attrs.attrId": { "value": "17" } } }
                    ]
                  }
                }
              }
            }
          ]
        }
      },
      "sort": [
        { "skuPrice": { "order": "desc" } }
      ],
      "from": 0,
      "size": 5,
      "highlight": {
        "fields": { "skuTitle": {} },
        "pre_tags": "{}",                                    # opening tag of the highlight
        "post_tags": "{}"                                    # closing tag
      },
      "aggs": {
        # aggregate all brand ids, and per id the brand's name and image
        "brandAgg": {
          "terms": { "field": "brandId", "size": 10 },
          "aggs": {
            # sub-aggregation: brand name per brand id
            "brandNameAgg": { "terms": { "field": "brandName", "size": 10 } },
            # sub-aggregation: brand image per brand id
            "brandImgAgg":  { "terms": { "field": "brandImg", "size": 10 } }
          }
        },
        "catalogAgg": {
          "terms": { "field": "catalogId", "size": 10 },
          "aggs": {
            "catalogNameAgg": { "terms": { "field": "catalogName", "size": 10 } }
          }
        },
        "attrs": {
          "nested": { "path": "attrs" },
          "aggs": {
            "attrIdAgg": {
              "terms": { "field": "attrs.attrId", "size": 10 },
              "aggs": {
                "attrNameAgg": { "terms": { "field": "attrs.attrName", "size": 10 } }
              }
            }
          }
        }
      }
    }

2. The DSL implemented in Java

    private SearchRequest buildSearchRequest(SearchParamVo param) {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        /*
         * Keyword match plus filters (attributes, category, brand, price range, stock)
         */
        // 1. build the bool query
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        // 1.1 bool-must
        if (!StringUtils.isEmpty(param.getKeyword())) {
            boolQueryBuilder.must(QueryBuilders.matchQuery("skuTitle", param.getKeyword()));
        }
        // 1.2 bool-filter
        // 1.2.1 catalogId
        if (null != param.getCatalog3Id()) {
            boolQueryBuilder.filter(QueryBuilders.termQuery("catalogId", param.getCatalog3Id()));
        }
        // 1.2.2 brandId
        if (null != param.getBrandId() && param.getBrandId().size() > 0) {
            boolQueryBuilder.filter(QueryBuilders.termsQuery("brandId", param.getBrandId()));
        }
        // 1.2.3 attrs
        if (param.getAttrs() != null && param.getAttrs().size() > 0) {
            param.getAttrs().forEach(item -> {
                // request form: attrs=1_5寸:8寸&attrs=2_16G:8G; one item is e.g. 1_5寸:8寸
                BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
                String[] s = item.split("_");
                String attrId = s[0];
                String[] attrValues = s[1].split(":"); // the values to search for this attribute
                boolQuery.must(QueryBuilders.termQuery("attrs.attrId", attrId));
                boolQuery.must(QueryBuilders.termsQuery("attrs.attrValue", attrValues));
                NestedQueryBuilder nestedQueryBuilder =
                        QueryBuilders.nestedQuery("attrs", boolQuery, ScoreMode.None);
                boolQueryBuilder.filter(nestedQueryBuilder);
            });
        }
        // 1.2.4 hasStock
        if (null != param.getHasStock()) {
            boolQueryBuilder.filter(QueryBuilders.termQuery("hasStock", param.getHasStock() == 1));
        }
        // 1.2.5 skuPrice, given as 1_500, _500 or 500_
        if (!StringUtils.isEmpty(param.getSkuPrice())) {
            RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("skuPrice");
            String skuPrice = param.getSkuPrice();
            String[] price = skuPrice.split("_");
            if (price.length == 2) {
                // "_500" splits into ["", "500"], so the lower bound is only set when it is present
                if (!price[0].isEmpty()) {
                    rangeQueryBuilder.gte(price[0]);
                }
                rangeQueryBuilder.lte(price[1]);
            } else if (price.length == 1 && skuPrice.endsWith("_")) {
                // "500_" splits into ["500"]
                rangeQueryBuilder.gte(price[0]);
            }
            boolQueryBuilder.filter(rangeQueryBuilder);
        }
        // all query conditions are assembled
        searchSourceBuilder.query(boolQueryBuilder);

        /*
         * Sorting, paging, highlighting
         */
        // sorting, given as sort=hotScore_asc or sort=hotScore_desc
        if (!StringUtils.isEmpty(param.getSort())) {
            String sort = param.getSort();
            String[] sortFields = sort.split("_");
            SortOrder sortOrder = "asc".equalsIgnoreCase(sortFields[1]) ? SortOrder.ASC : SortOrder.DESC;
            searchSourceBuilder.sort(sortFields[0], sortOrder);
        }
        // paging
        searchSourceBuilder.from((param.getPageNum() - 1) * EsContant.PRODUCT_PAGESIZE);
        searchSourceBuilder.size(EsContant.PRODUCT_PAGESIZE);
        // highlighting
        if (!StringUtils.isEmpty(param.getKeyword())) {
            HighlightBuilder highlightBuilder = new HighlightBuilder();
            highlightBuilder.field("skuTitle");
            highlightBuilder.preTags("");
            highlightBuilder.postTags("");
            searchSourceBuilder.highlighter(highlightBuilder);
        }

        /*
         * Aggregations
         */
        // 1. aggregate by brand
        TermsAggregationBuilder brand_agg = AggregationBuilders.terms("brand_agg");
        brand_agg.field("brandId").size(50);
        // 1.1 sub-aggregation: brand name
        brand_agg.subAggregation(AggregationBuilders.terms("brand_name_agg").field("brandName").size(1));
        // 1.2 sub-aggregation: brand image
        brand_agg.subAggregation(AggregationBuilders.terms("brand_img_agg").field("brandImg").size(1));
        searchSourceBuilder.aggregation(brand_agg);

        // 2. aggregate by category
        TermsAggregationBuilder catalog_agg = AggregationBuilders.terms("catalog_agg");
        catalog_agg.field("catalogId").size(20);
        catalog_agg.subAggregation(AggregationBuilders.terms("catalog_name_agg").field("catalogName").size(1));
        searchSourceBuilder.aggregation(catalog_agg);

        // 3. aggregate by attribute
        NestedAggregationBuilder attr_agg = AggregationBuilders.nested("attr_agg", "attrs");
        // 3.1 aggregate by attribute id
        TermsAggregationBuilder attr_id_agg = AggregationBuilders.terms("attr_id_agg").field("attrs.attrId");
        attr_agg.subAggregation(attr_id_agg);
        // 3.1.1 per attribute id, aggregate by attribute name
        attr_id_agg.subAggregation(AggregationBuilders.terms("attr_name_agg").field("attrs.attrName").size(1));
        // 3.1.2 per attribute id, aggregate by attribute value
        attr_id_agg.subAggregation(AggregationBuilders.terms("attr_value_agg").field("attrs.attrValue").size(50));
        searchSourceBuilder.aggregation(attr_agg);

        return new SearchRequest(new String[]{EsContant.PRODUCT_INDEX}, searchSourceBuilder);
    }
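The request parameters handled in this method follow small string conventions (1_500, _500 or 500_ for the price, id_value:value for an attribute, a 1-based page number) that are easy to get wrong with split("_"), because Java's split keeps a leading empty string and drops trailing ones. As a sketch of one way to isolate that parsing, the helper below uses indexOf instead; SearchParams, PriceRange and AttrFilter are hypothetical names and not part of the project code.

```java
/** Hypothetical parsing helpers for the search parameters (illustration only). */
final class SearchParams {
    /** Price bounds as strings; null means that side is unbounded. */
    static final class PriceRange {
        final String gte;
        final String lte;

        PriceRange(String gte, String lte) {
            this.gte = gte;
            this.lte = lte;
        }
    }

    /** One attribute filter: the attribute id and the accepted values. */
    static final class AttrFilter {
        final String attrId;
        final String[] values;

        AttrFilter(String attrId, String[] values) {
            this.attrId = attrId;
            this.values = values;
        }
    }

    /** Parses "1_500", "_500" or "500_" without relying on split's handling of empty parts. */
    static PriceRange parsePrice(String skuPrice) {
        int sep = skuPrice.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("expected min_max, _max or min_: " + skuPrice);
        }
        String min = skuPrice.substring(0, sep);
        String max = skuPrice.substring(sep + 1);
        return new PriceRange(min.isEmpty() ? null : min, max.isEmpty() ? null : max);
    }

    /** Parses an attribute filter such as "2_16G:8G" into id "2" and values 16G, 8G. */
    static AttrFilter parseAttr(String attr) {
        int sep = attr.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("expected id_value[:value...]: " + attr);
        }
        return new AttrFilter(attr.substring(0, sep), attr.substring(sep + 1).split(":"));
    }

    /** Converts a 1-based page number into the "from" offset of the search. */
    static int fromOffset(int pageNum, int pageSize) {
        return (pageNum - 1) * pageSize;
    }
}
```

Keeping the parsing in plain functions like these means the edge cases can be unit-tested without an ES instance, and the query builder code only has to apply whichever bounds are non-null.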

Reposted from: http://bcqof.baihongyu.com/
