Java field aliases: aliasing a field when creating an ES mapping

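By default, ES creates the mapping for an index and its types dynamically. That is fine while you are learning, but in production you should always define the mapping by hand! A requirement that comes up often in production is wanting to aggregate on a field while also running fuzzy (full-text) searches against it. The way to handle this is to create an alias for the field (what Elasticsearch calls a multi-field, declared under "fields" in the mapping)!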

The mapping is structured as follows:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "english_keywords": {
            "type": "keyword_marker",
            "keywords": [
              "topsec"
            ]
          },
          "english_stemmer": {
            "type": "stemmer",
            "language": "english"
          },
          "english_possessive_stemmer": {
            "type": "stemmer",
            "language": "possessive_english"
          },
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          }
        },
        "analyzer": {
          "english": {
            "type": "custom",
            "filter": [
              "lowercase",
              "english_stop"
            ],
            "tokenizer": "standard"
          },
          "ik": {
            "filter": ["lowercase"],
            "type": "custom",
            "tokenizer": "ik_max_word"
          },
          "html": {
            "filter": [
              "lowercase",
              "english_stop"
            ],
            "char_filter": [
              "html_strip"
            ],
            "type": "custom",
            "tokenizer": "standard"
          },
          "lower": {
            "filter": "lowercase",
            "type": "custom",
            "tokenizer": "keyword"
          }
        }
      },
      "number_of_shards": "1",
      "number_of_replicas": "0"
    }
  },
  "mappings": {
    "test": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "name": {
          "type": "keyword"
        },
        "age": {
          "type": "keyword",
          "fields": {
            "cn": {
              "analyzer": "ik",
              "type": "text"
            }
          }
        },
        "address": {
          "type": "text"
        }
      }
    }
  }
}
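As a rough illustration of the "age" part of this mapping, the following Python sketch builds the same keyword-plus-sub-field structure as a plain dictionary. The helper names multi_field_property and sub_field_path are made up for this example and are not part of any Elasticsearch client; the sketch only shows the shape of the JSON and how the sub-field is addressed in queries.

```python
import json


def multi_field_property(sub_name, analyzer):
    """Hypothetical helper: a keyword field with one analyzed text sub-field."""
    return {
        "type": "keyword",  # main field: stored as a single, unanalyzed term
        "fields": {
            # sub-field: same source value, but analyzed as text
            sub_name: {"type": "text", "analyzer": analyzer},
        },
    }


def sub_field_path(field, sub_name):
    """Sub-fields are addressed in queries as '<field>.<sub_name>'."""
    return f"{field}.{sub_name}"


age_property = multi_field_property("cn", "ik")
print(json.dumps({"age": age_property}, indent=2))
print(sub_field_path("age", "cn"))
```

The main field and the sub-field are filled from the same source value; only the way that value is indexed differs.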

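The age field has "type": "keyword", so it is not analyzed. It is then given an alias, cn, which is analyzed with the ik analyzer! Four documents are inserted:

[image: dac5d953d3539e9b33d8e0c00689a7d7.png]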

To run statistics on the data by age, use the unanalyzed age field together with an exact-match rule. The query:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "age": "北京市海淀区西二旗中关村西门"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "aggs": {}
}

Result: [image: 2558c0732f911256b14da952debbef26.png]

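Running statistics on the analyzed alias age.cn is problematic. The results below show that, because age.cn is analyzed, a term condition only matches when it is identical to one of the tokens the analyzer produced from the content of age: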

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "age.cn": "北京市海淀区西二旗中关村西门"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "aggs": {}
}

Result: [image: 2646de4666da178e67e3705a525595b3.png]

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "age.cn": "北京市"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "aggs": {}
}

Result: [image: 4e0400464b0a1a7c80790128ee06f21a.png]
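The behaviour of term against age and age.cn can be imitated with a toy model in Python. This is only a sketch: toy_tokenize is a hypothetical stand-in with a hard-coded vocabulary, not the real output of ik_max_word, and the sets stand in for the terms Elasticsearch would index.

```python
def keyword_terms(value):
    # A keyword field indexes the whole value as one single term.
    return {value}


def text_terms(value, tokenize):
    # A text field indexes whatever tokens the analyzer emits.
    return set(tokenize(value))


def term_query_matches(indexed_terms, query_value):
    # A term query does not analyze its input; it looks up the exact term.
    return query_value in indexed_terms


def toy_tokenize(value):
    # Assumption: a made-up vocabulary, NOT the real ik_max_word output.
    vocabulary = ["北京市", "海淀区", "西二旗", "中关村", "西门"]
    return [word for word in vocabulary if word in value]


doc = "北京市海淀区西二旗中关村西门"
age = keyword_terms(doc)                # models the "age" field
age_cn = text_terms(doc, toy_tokenize)  # models the "age.cn" sub-field

print(term_query_matches(age, doc))          # True: whole value is the term
print(term_query_matches(age_cn, doc))       # False: whole value is not a token
print(term_query_matches(age_cn, "北京市"))  # True: "北京市" is one of the tokens
```

In this model the full address only exists as a term on the keyword field, which is why term with the full string works on age but not on age.cn.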

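Using match for statistics is also problematic, because documents that should not be counted are included as well. match compares the query text with the field content and scores each document by how well it matches; a higher score means a better match, and those documents are ranked first.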

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "age.cn": "北京市海淀区西二旗中关村"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "aggs": {}
}

Result: [image: 06832da9c0c14920a502e4dd318471e6.png]

The following query shows results ranked by match score:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "age.cn": "北京市昌平区"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "aggs": {}
}

Result: [image: 43234c23beafda8b48f3a0e80c2aa8ef.png]
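Why match also returns partially matching documents can be sketched with a very simplified model. Everything here is an assumption made for illustration: the sample documents and vocabulary are invented (they are not the four documents from the screenshots), and counting shared tokens merely stands in for Elasticsearch's real relevance scoring.

```python
def toy_tokenize(text, vocabulary):
    # Assumption: naive vocabulary lookup, not the real ik analyzer.
    return [word for word in vocabulary if word in text]


def match_scores(docs, query, vocabulary):
    # The query text is analyzed too; a document matches if it shares
    # at least one token with the query (match defaults to OR).
    query_tokens = set(toy_tokenize(query, vocabulary))
    scored = []
    for doc in docs:
        overlap = len(query_tokens & set(toy_tokenize(doc, vocabulary)))
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest score first, mimicking relevance ordering.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical sample data, invented for this sketch.
vocabulary = ["北京市", "昌平区", "海淀区", "上海市", "浦东新区"]
docs = ["北京市海淀区", "北京市昌平区", "上海市浦东新区"]

for score, doc in match_scores(docs, "北京市昌平区", vocabulary):
    print(score, doc)
```

In this model the document for 海淀区 is still returned because it shares the token 北京市 with the query, only with a lower score. That is the effect that makes match unsuitable for exact counting.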

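Summary: for statistics, use term on the unanalyzed field, with exact matching; for fuzzy search, use match on the analyzed field, where an exact match is not required!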
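The rule of thumb can be captured in a few small Python helpers that build the request bodies. This is a sketch under assumptions: the function names and the bucket name by_value are invented for the example, and the terms aggregation is added here as one common way to count documents per exact value on a keyword field, since the queries above leave "aggs" empty.

```python
import json


def exact_query(field, value):
    # term on the unanalyzed (keyword) field: exact match only
    return {"query": {"term": {field: value}}}


def full_text_query(field, sub_name, text):
    # match on the analyzed sub-field: query text is analyzed as well
    return {"query": {"match": {f"{field}.{sub_name}": text}}}


def count_by_value(field, bucket_name="by_value"):
    # terms aggregation on the keyword field; "size": 0 suppresses the hits
    return {"size": 0, "aggs": {bucket_name: {"terms": {"field": field}}}}


print(json.dumps(exact_query("age", "北京市海淀区西二旗中关村西门"), ensure_ascii=False))
print(json.dumps(full_text_query("age", "cn", "北京市昌平区"), ensure_ascii=False))
print(json.dumps(count_by_value("age"), ensure_ascii=False))
```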

If anything here is wrong, please bear with me and point it out; I would be very grateful! Comments and discussion are welcome!


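Copyright notice: this is an original article by weixin_35728216, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.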