[Elasticsearch] v7.0.1 Indexing / Mapping / Searching

코린이의 기록

코린이예요 2019. 5. 8. 18:04

For a basic introduction to Elasticsearch, see: https://victorydntmd.tistory.com/308?category=742451

Check the cluster status

$ curl -XGET 'localhost:9200/_cat/health?v&pretty'

List the nodes in the cluster

$ curl -XGET 'localhost:9200/_cat/nodes?v&pretty'

Indexing: adding data

List the existing indices: GET _cat/indices

$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'

Create an index: PUT /[index]

Let's create an index named classes.

$ curl -XPUT 'http://localhost:9200/classes?pretty'

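Indexing corresponds to the C (create) and U (update) of CRUD. A document is addressed as name/type/id. The index name and the type are mandatory; if the id is omitted (in a POST request), Elasticsearch generates one automatically.

Result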

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "classes"
}

Get an index: GET /[index]

$ curl -XGET 'localhost:9200/classes?pretty'

Result

{
  "classes" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1557392059842",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "s9GoHW4aR4Of2acgXdBjgg",
        "version" : {
          "created" : "7000199"
        },
        "provided_name" : "classes"
      }
    }
  }
}

Create a document: POST /[index]/[type]/[id]

Index    Type   ID
classes  class  1

Let's create the following document:

{
  "title":"algorithm",
  "professor":"John"
}

$ curl -XPOST 'localhost:9200/classes/class/1?pretty' -d '{"title":"algorithm", "professor":"John"}' -H 'Content-Type: application/json'

Note that Elasticsearch 6.0 and later return an error when a request with a body does not specify a Content-Type header. Always pass -H 'Content-Type: application/json' when creating a document.

The pretty parameter only makes the response easier to read. Without it, the whole response is printed on a single line:

ex. {"_index":"classes","_type":"class","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}

Result

{
  "_index" : "classes",
  "_type" : "class",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

When a document is first created, _version is 1 and result is created.

Get a document: GET /[index]/[type]/[id]

Let's check that the index and document created above exist.

$ curl -XGET 'localhost:9200/classes/class/1?pretty'

Result

{
  "_index" : "classes",
  "_type" : "class",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "title" : "algorithm",
    "professor" : "John"
  }
}

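Update a document: POST /[index]/[type]/[id]

To modify the document above, change the fields you want and send it the same way as when creating it. (The example below resends the same body; the version still increases.)

$ curl -XPOST 'localhost:9200/classes/class/1?pretty' -d '{"title":"algorithm", "professor":"John"}' -H 'Content-Type: application/json'

Result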

{
  "_index" : "classes",
  "_type" : "class",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}

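When an existing document is modified, _version is incremented and result becomes updated.

Create a document from a file, or several in bulk: POST /[index]/[type]/[id] -d @[json file]

If you would rather keep the JSON document in a file than type it on the command line, proceed as follows.

First, create a JSON file.

$ vi data.json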

{
"title":"algorithm",
"professor":"John"
}

$ curl -XPOST 'localhost:9200/classes/class/1?pretty' -d @data.json -H 'Content-Type: application/json'

Result

{
  "_index" : "classes",
  "_type" : "class",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

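To create several documents in one request, use the _bulk API. The file holds one action line followed by one document line for each document.

$ vi data.json

{ "index" : { "_index" : "classes", "_type" : "class", "_id" : "1" } }
{ "title":"algorithm", "professor":"John"}
{ "index" : { "_index" : "classes", "_type" : "class", "_id" : "2" } }
{ "title":"mathematics", "professor":"Kim"}
{ "index" : { "_index" : "classes", "_type" : "class", "_id" : "3" } }
{ "title":"History", "professor":"Yoon"}

Caution!

The last line of the file must end with a newline ('\n'). Otherwise the request fails with the error "The bulk request must be terminated by a newline [\\n]". Send the file with --data-binary rather than -d, because -d strips newlines.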
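One way to avoid the newline error is to generate the file with printf instead of an editor. The following is only a minimal sketch that reuses the file name and documents from above; the final check is an illustration, not something Elasticsearch requires.

```shell
# Write one action line and one source line per document.
# printf repeats the '%s\n' format for every argument, so each line,
# including the last one, is terminated by a newline.
printf '%s\n' \
  '{ "index" : { "_index" : "classes", "_type" : "class", "_id" : "1" } }' \
  '{ "title":"algorithm", "professor":"John" }' \
  '{ "index" : { "_index" : "classes", "_type" : "class", "_id" : "2" } }' \
  '{ "title":"mathematics", "professor":"Kim" }' \
  '{ "index" : { "_index" : "classes", "_type" : "class", "_id" : "3" } }' \
  '{ "title":"History", "professor":"Yoon" }' \
  > data.json

# Command substitution strips trailing newlines, so an empty result here
# means the last byte of the file is '\n'.
if [ -z "$(tail -c 1 data.json)" ]; then
  echo "data.json ends with a newline"
else
  echo "data.json is missing the final newline" >&2
fi
```

The resulting file can be sent with the same --data-binary command shown in this section.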

$ curl -XPOST 'localhost:9200/classes/class/_bulk?pretty' --data-binary @data.json -H 'Content-Type: application/json'

{
  "took" : 13,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "classes",
        "_type" : "class",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 5,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "classes",
        "_type" : "class",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 6,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "classes",
        "_type" : "class",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 7,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

Delete an existing index: DELETE /[index]

$ curl -XDELETE 'localhost:9200/classes?pretty'

Result

{
"acknowledged" : true
}

Mapping: setting the schema of an index

A mapping is similar to a schema in a relational database (RDBMS): it defines the data types of the fields stored in an Elasticsearch index. Defining a mapping is not mandatory, but it is recommended, because without one Elasticsearch may, for example, store a date field as text. The history of types across Elasticsearch versions is as follows:

5.0 : started enforcing that fields that share the same name across multiple types have compatible mappings.
6.0 : started preventing new indices from having more than one type and deprecated the _default_ mapping.
7.0 : deprecated APIs that accept types, introduced new typeless APIs, and removed support for the _default_ mapping.
8.0 : will remove APIs that accept types.

Elasticsearch 5.6.0

Setting index.mapping.single_type: true on an index will enable the single-type-per-index behaviour which will be enforced in 6.0.
The join field replacement for parent-child is available on indices created in 5.6.

Elasticsearch 6.x

Indices created in 5.x will continue to function in 6.x as they did in 5.x.
Indices created in 6.x only allow a single-type per index. Any name can be used for the type, but there can be only one. The preferred type name is _doc, so that index APIs have the same path as they will have in 7.0: PUT {index}/_doc/{id} and POST {index}/_doc
The _type name can no longer be combined with the _id to form the _uid field. The _uid field has become an alias for the _id field.
New indices no longer support the old-style of parent/child and should use the join field instead.
The _default_ mapping type is deprecated.
In 6.7, the index creation, index template, and mapping APIs support a query string parameter (include_type_name) which indicates whether requests and responses should include a type name. It defaults to true, and should be set to an explicit value to prepare to upgrade to 7.0. Not setting include_type_name will result in a deprecation warning. Indices which don’t have an explicit type will use the dummy type name _doc.

Elasticsearch 7.x

Specifying types in requests is deprecated. For instance, indexing a document no longer requires a document type. The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST {index}/_doc for auto-generated ids. Note that in 7.0, _doc is a permanent part of the path, and represents the endpoint name rather than the document type.
The include_type_name parameter in the index creation, index template, and mapping APIs will default to false. Setting the parameter at all will result in a deprecation warning.
The _default_ mapping type is removed.

Elasticsearch 8.x

Specifying types in requests is no longer supported.
The include_type_name parameter is removed.

In short: 5.x supported multiple types per index, but from 6.0 a new index can have only one type, and the _default_ mapping was deprecated. 7.0 deprecated the APIs that accept types, introduced new typeless APIs, and removed the _default_ mapping. 8.0 will remove the APIs that accept types.

See

https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
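The typeless 7.x paths quoted above (PUT {index}/_doc/{id} and POST {index}/_doc) can be sketched as two small shell helpers. This is an illustration only: put_doc, post_doc and the ES_URL variable are hypothetical names introduced here, and the sketch assumes a 7.x node on localhost:9200 and an index that was created without a custom type.

```shell
# Assumption: Elasticsearch 7.x on localhost:9200 (override with ES_URL).
ES_URL="${ES_URL:-localhost:9200}"

# Index a document with an explicit id: PUT {index}/_doc/{id}
put_doc() {  # usage: put_doc INDEX ID JSON
  curl -s -XPUT "$ES_URL/$1/_doc/$2?pretty" \
       -H 'Content-Type: application/json' -d "$3"
}

# Index a document with an auto-generated id: POST {index}/_doc
post_doc() {  # usage: post_doc INDEX JSON
  curl -s -XPOST "$ES_URL/$1/_doc?pretty" \
       -H 'Content-Type: application/json' -d "$2"
}

# Example calls (need a running cluster):
#   put_doc classes 1 '{"title":"algorithm", "professor":"John"}'
#   post_doc classes '{"title":"mathematics", "professor":"Kim"}'
```

Here _doc is the endpoint name rather than a document type, which is why no type such as class appears in the path.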

Field types

a simple type like text, keyword, date, long, double, boolean or ip.
a type which supports the hierarchical nature of JSON such as object or nested.
or a specialised type like geo_point, geo_shape, or completion.

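Create a mapping: PUT /[index]/[type]/_mapping

The index must exist before a mapping can be added, so re-create classes first if you deleted it above.

$ curl -XPUT 'http://localhost:9200/classes/class/_mapping?include_type_name=true&pretty' -d @data_mapping.json -H 'Content-Type:application/json'

data_mapping.json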

{
  "class": {
    "properties": {
      "title": {
        "type": "text"
      },
      "professor": {
        "type": "text"
      },
      "major": {
        "type": "text"
      },
      "semester": {
        "type": "text"
      },
      "student_count": {
        "type": "integer"
      },
      "unit": {
        "type": "integer"
      },
      "submit_date": {
        "type": "date",
        "format": "yyyy-MM-dd"
      },
      "school_location": {
        "type": "geo_point"
      }
    }
  }
}

Note that the string field type was removed in Elasticsearch 6.0 (it had been deprecated since 5.0), so use "text" or "keyword" instead.

Result

{
"acknowledged" : true
}
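In 7.x, include_type_name defaults to false, so the same mapping can also be sent without a type. The sketch below shows what that might look like; the file name data_mapping_typeless.json, the put_mapping helper and the ES_URL variable are hypothetical names introduced for illustration, and the fields are the ones defined in this section.

```shell
# Assumption: Elasticsearch 7.x on localhost:9200 (override with ES_URL).
ES_URL="${ES_URL:-localhost:9200}"

# Typeless mapping body: "properties" sits at the top level,
# without a wrapper object named after the type.
cat > data_mapping_typeless.json <<'EOF'
{
  "properties": {
    "title":           { "type": "text" },
    "professor":       { "type": "text" },
    "major":           { "type": "text" },
    "semester":        { "type": "text" },
    "student_count":   { "type": "integer" },
    "unit":            { "type": "integer" },
    "submit_date":     { "type": "date", "format": "yyyy-MM-dd" },
    "school_location": { "type": "geo_point" }
  }
}
EOF

# PUT {index}/_mapping, without a type and without include_type_name.
put_mapping() {  # usage: put_mapping INDEX FILE
  curl -s -XPUT "$ES_URL/$1/_mapping?pretty" \
       -H 'Content-Type: application/json' -d @"$2"
}

# Example call (needs a running cluster and an existing index):
#   put_mapping classes data_mapping_typeless.json
```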

Get the mapping: GET /[index]/_mapping

$ curl -XGET 'localhost:9200/classes/_mapping?pretty'

Result

{
  "classes" : {
    "mappings" : {
      "properties" : {
        "major" : {
          "type" : "text"
        },
        "professor" : {
          "type" : "text"
        },
        "school_location" : {
          "type" : "geo_point"
        },
        "semester" : {
          "type" : "text"
        },
        "student_count" : {
          "type" : "integer"
        },
        "submit_date" : {
          "type" : "date",
          "format" : "yyyy-MM-dd"
        },
        "title" : {
          "type" : "text"
        },
        "unit" : {
          "type" : "integer"
        }
      }
    }
  }
}

Searching: searching a specific index and type

POST /[index]/[type]/_search

$ curl -XPOST 'localhost:9200/classes/class/_search?q=professor:Yoon&pretty'

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3862944,
    "hits" : [
      {
        "_index" : "classes",
        "_type" : "class",
        "_id" : "3",
        "_score" : 1.3862944,
        "_source" : {
          "title" : "History",
          "professor" : "Yoon"
        }
      }
    ]
  }
}
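The q= parameter is convenient for quick checks, but a comparable search can also be sent as a JSON request body. The following is a rough sketch using a match query on the typeless _search endpoint; match_query, search_match and ES_URL are hypothetical names introduced here, and the helper assumes the field and value need no JSON escaping.

```shell
# Assumption: Elasticsearch 7.x on localhost:9200 (override with ES_URL).
ES_URL="${ES_URL:-localhost:9200}"

# Build a match query body. Assumption: FIELD and VALUE contain no characters
# that need JSON escaping (quotes, backslashes, control characters).
match_query() {  # usage: match_query FIELD VALUE
  printf '{"query":{"match":{"%s":"%s"}}}' "$1" "$2"
}

# POST {index}/_search with the query in the request body.
search_match() {  # usage: search_match INDEX FIELD VALUE
  curl -s -XPOST "$ES_URL/$1/_search?pretty" \
       -H 'Content-Type: application/json' -d "$(match_query "$2" "$3")"
}

# Example call (needs a running cluster):
#   search_match classes professor Yoon
```

A request body becomes more useful than q= once a search needs filters, sorting or several clauses.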

Reference

https://victorydntmd.tistory.com/312?category=742451

https://www.edureka.co/blog/elk-stack-tutorial/

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

https://www.youtube.com/watch?v=lt6oPHjZMXg&list=PLVNY1HnUlO24LCsgOxR_eK2Yi4sOgH9Pg&index=4
