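Building Elasticsearch with Docker
Introduction
I normally do research on AI applications for civil engineering, but a neighboring department came to me with a question: "We have tens of millions of text files, and because there are so many of them, searching is too slow. Is there any way to improve this?"
I had used Hadoop and HDFS in earlier research, but that work targeted GIS, and I had always assumed Hadoop was only for batch processing. So while looking for a good tool specialized in full-text search, I came across Elasticsearch (rather late to the party, I admit).
To try it out, I first wanted to set up Elasticsearch. However, I had completely forgotten how to write a docker-compose file, so I am splitting the setup into separate posts... (this is what age does to your memory)
So, this time we will create a Docker image for Elasticsearch.
Sites I referred to
I referred to the following sites for this setup. Many thanks to their authors.
Elasticsearch for the first time with Docker
An introductory tutorial for getting started with Elasticsearch
Build environment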
CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
Memory: 32GB
OS: Ubuntu 18.04.4 LTS (Bionic Beaver)
Docker: Docker version 20.10.5, build 55c4c88
docker-compose: version 1.16.1, build 6d1ac21
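Setting up Elasticsearch
Creating the Elasticsearch Docker image
First, create a Dockerfile.
Looking at elasticsearch on DockerHub, the latest version at the time of writing was 7.13.2, so the first line of the Dockerfile pulls the 7.13.2 Docker image.
Because I am Japanese and want Japanese-language support, I also install the kuromoji plugin (second line).
In addition, to support recent vocabulary, I install the NEologd plugin as well (third line).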
FROM docker.elastic.co/elasticsearch/elasticsearch:7.13.2
RUN elasticsearch-plugin install analysis-kuromoji
RUN elasticsearch-plugin install org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0
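(Addendum start)
I originally included the line "RUN elasticsearch-plugin install org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0" in the Dockerfile above, but it caused a Java error when I ran docker-compose up in the next post, so I have removed it. The error details are appended at the end of this post.
(Addendum end)
Now that the Dockerfile is ready, let's run docker build.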
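With the NEologd line removed as described in the addendum, the Dockerfile is left with two lines. As a minimal sketch, the following writes that two-line file from the shell. Note that it overwrites any file named Dockerfile in the current directory.

```shell
#!/bin/sh
# Write the Dockerfile without the NEologd plugin line.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > Dockerfile <<'EOF'
FROM docker.elastic.co/elasticsearch/elasticsearch:7.13.2
RUN elasticsearch-plugin install analysis-kuromoji
EOF

# Show the result so it can be checked before building.
cat Dockerfile
```

A build from this file would report two steps rather than the three shown in the log below, which was captured before the line was removed.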
$ docker build -f ./Dockerfile .
Sending build context to Docker daemon 4.096kB
Step 1/3 : FROM docker.elastic.co/elasticsearch/elasticsearch:7.13.2
7.13.2: Pulling from elasticsearch/elasticsearch
ddf49b9115d7: Pull complete
815a15889ec1: Pull complete
ba5d33fc5cc5: Pull complete
976d4f887b1a: Pull complete
9b5ee4563932: Pull complete
ef11e8f17d0c: Pull complete
3c5ad4db1e24: Pull complete
Digest: sha256:1cecc2c7419a4f917a88c83180335bd491d623f28ac43ca7e0e69b4eca25fbd5
Status: Downloaded newer image for docker.elastic.co/elasticsearch/elasticsearch:7.13.2
---> 11a830014f7c
Step 2/3 : RUN elasticsearch-plugin install analysis-kuromoji
---> Running in 0cdc1f7f3102
-> Installing analysis-kuromoji
-> Downloading analysis-kuromoji from elastic
[=================================================] 100%
-> Installed analysis-kuromoji
-> Please restart Elasticsearch to activate any plugins installed
Removing intermediate container 0cdc1f7f3102
---> 144040a82003
Step 3/3 : RUN elasticsearch-plugin install org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0
---> Running in 71997e0aca6e
-> Installing org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0
-> Downloading org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0 from maven central
[=================================================] 100%
Warning: sha512 not found, falling back to sha1. This behavior is deprecated and will be removed in a future release. Please update the plugin to use a sha512 checksum.
-> Installed analysis-kuromoji-ipadic-neologd
-> Please restart Elasticsearch to activate any plugins installed
Removing intermediate container 71997e0aca6e
---> f53988aa2593
Successfully built f53988aa2593
Judging from the log, there seem to be no problems. Still, let's check with docker images.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.elastic.co/elasticsearch/elasticsearch 7.13.2 11a830014f7c 3 weeks ago 1.02GB
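The build above was run without the -t option, so the resulting image (f53988aa2593) has no repository name or tag and is harder to pick out in the docker images list. The sketch below shows one way to tag the image at build time and then confirm which plugins it contains. The image name es-kuromoji:7.13.2 and the function name build_and_check are made up for illustration. Running elasticsearch-plugin list through docker run assumes the image's entrypoint passes custom commands through, which I have not verified against this exact build.

```shell
#!/bin/sh
# Hypothetical image name; pick whatever suits your project.
IMAGE_NAME="es-kuromoji:7.13.2"

build_and_check() {
    # Build with an explicit name and tag so the image is easy to find later.
    docker build -t "$IMAGE_NAME" -f ./Dockerfile . &&
    # List only the image we just built.
    docker images "$IMAGE_NAME" &&
    # Ask the image which plugins are installed (assumes the entrypoint
    # executes the given command instead of starting Elasticsearch).
    docker run --rm "$IMAGE_NAME" elasticsearch-plugin list
}

# Only attempt the build when both the docker CLI and a Dockerfile exist.
if command -v docker >/dev/null 2>&1 && [ -f ./Dockerfile ]; then
    build_and_check
else
    echo "docker or ./Dockerfile not found; skipping build"
fi
```

Run it from the directory that contains the Dockerfile.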
Next, I plan to write a docker-compose.yml file and start the container.
(Addendum follows)
Below is the Java error I mentioned in the addendum.
es01 | "stacktrace": ["java.lang.NoSuchMethodError: 'void org.elasticsearch.index.analysis.AbstractTokenizerFactory.<init>(org.elasticsearch.index.IndexSettings, org.elasticsearch.common.settings.Settings)'",
es01 | "at org.codelibs.elasticsearch.kuromoji.ipadic.neologd.index.analysis.KuromojiTokenizerFactory.<init>(KuromojiTokenizerFactory.java:50) ~[?:?]",
es01 | "at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:433) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:275) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:203) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:431) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:663) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:566) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.validateTemplate(MetadataIndexTemplateService.java:1288) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.access$300(MetadataIndexTemplateService.java:83) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService$6.execute(MetadataIndexTemplateService.java:775) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:48) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:691) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:313) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:208) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:62) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:140) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:139) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:177) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01 | "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]",
es01 | "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]",
es01 | "at java.lang.Thread.run(Thread.java:831) [?:?]"] }
es01 | fatal error in thread [elasticsearch[es01][masterService#updateTask][T#1]], exiting
es01 | java.lang.NoSuchMethodError: 'void org.elasticsearch.index.analysis.AbstractTokenizerFactory.<init>(org.elasticsearch.index.IndexSettings, org.elasticsearch.common.settings.Settings)'
es01 | at org.codelibs.elasticsearch.kuromoji.ipadic.neologd.index.analysis.KuromojiTokenizerFactory.<init>(KuromojiTokenizerFactory.java:50)
es01 | at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:433)
es01 | at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:275)
es01 | at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:203)
es01 | at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:431)
es01 | at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:663)
es01 | at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:566)
es01 | at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.validateTemplate(MetadataIndexTemplateService.java:1288)
es01 | at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.access$300(MetadataIndexTemplateService.java:83)
es01 | at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService$6.execute(MetadataIndexTemplateService.java:775)
es01 | at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:48)
es01 | at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:691)
es01 | at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:313)
es01 | at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:208)
es01 | at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:62)
es01 | at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:140)
es01 | at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:139)
es01 | at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:177)
es01 | at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)
es01 | at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241)
es01 | at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204)
es01 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
es01 | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
es01 | at java.base/java.lang.Thread.run(Thread.java:831)
es01 exited with code 1
(End of addendum)
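The NoSuchMethodError in the log above means the NEologd plugin calls an AbstractTokenizerFactory constructor that the classes in elasticsearch-7.13.2.jar do not provide. One plausible cause, which I have not confirmed, is that the plugin (7.1.0) was built against a different Elasticsearch version than the image (7.13.2). The sketch below compares the two version strings before building. It assumes the number at the end of the plugin coordinate is meant to match the Elasticsearch version, and the helper names version_of and same_version are made up for illustration.

```shell
#!/bin/sh
# Print everything after the last ':' in the argument,
# e.g. the tag of an image or the version of a plugin coordinate.
version_of() {
    printf '%s\n' "${1##*:}"
}

# Succeed only when both arguments end in the same version string.
same_version() {
    [ "$(version_of "$1")" = "$(version_of "$2")" ]
}

IMAGE="docker.elastic.co/elasticsearch/elasticsearch:7.13.2"
PLUGIN="org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0"

if same_version "$IMAGE" "$PLUGIN"; then
    echo "versions match"
else
    echo "version mismatch: image $(version_of "$IMAGE"), plugin $(version_of "$PLUGIN")"
fi
```

For the values used in this post, the script reports a mismatch between 7.13.2 and 7.1.0. A matching version string would not by itself guarantee that a plugin loads correctly, so this is only a quick sanity check.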