Elasticsearch on Kubernetes

Getting Elasticsearch running on Kubernetes, posted on 11th Jan 2020


Elasticsearch is pretty awesome, but with that awesomeness comes a lot of complication, both in setup and in use. One of the decisions you'll need to make is where to run it. There are plenty of options, from a fully hosted service to running it on your own server, but for me the best option was to run it on my Kubernetes cluster, which is also where the API that will use it, and the Redis cache in front of it, are running.

Luckily, the folks at Elastic have put together a Kubernetes operator (ECK), which simplifies what would otherwise be a super complex setup, especially if you are just starting with Elasticsearch (or indeed with Kubernetes).

Kubernetes operator

If you are hosting stateless containers on Kubernetes, getting things going is pretty simple. A GraphQL or REST API is typically stateless, which makes GKE a great hosting solution for those kinds of apps. A database, however, is not stateless (and neither are Redis nor Elasticsearch). Kubernetes has the concept of a StatefulSet, which gives pods stable identities and persistent storage, and so enables the hosting of complex apps such as databases. The problem, though, is that maintaining the related services associated with such deployments is complex, especially when there are changes to manage. If you are familiar with Helm charts you'll know these can provide a templated way to simplify the setup of complex, integrated apps on Kubernetes, but to deal with operational changes you need something else. This is where operators come in. They manage the operational lifecycle of complex Kubernetes deployments, making sure that everything lines up and making it easy for developers to get started with deploying them.
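To make that concrete: once the operator is installed, an Elasticsearch cluster is just another Kubernetes resource described in a manifest, and operational changes are handled by editing and re-applying that manifest. For example, using the createcluster.yaml defined later in this post, a scale-up or version bump would look something like this, with the operator rolling the change through the pods for you:

# edit createcluster.yaml (e.g. count: 3 -> 5, or bump spec.version), then
kubectl apply -f createcluster.yaml
# watch the operator reconcile the change
kubectl get elasticsearch -w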

Setting up Elastic

The Elastic site documents their operator, and it really is pretty straightforward to get going (although some eventual tuning will be needed to get it just right for your use case), so this tutorial will stick fairly close to their quickstart configuration.

Prerequisites

I'll assume you have a Kubernetes cluster ready to go, but you may need to scale it up by a few nodes. I added 2 nodes to accommodate this implementation; here's the resulting node pool

NAME                                 STATUS   ROLES    AGE    VERSION
gke-fid-default-pool-8c0eb8cf-0rkg   Ready    <none>   17h    v1.14.8-gke.12
gke-fid-default-pool-8c0eb8cf-261m   Ready    <none>   2d3h   v1.14.8-gke.12
gke-fid-default-pool-8c0eb8cf-2jdf   Ready    <none>   20h    v1.14.8-gke.12
gke-fid-default-pool-8c0eb8cf-6hpj   Ready    <none>   24h    v1.14.8-gke.12
gke-fid-default-pool-8c0eb8cf-8pzk   Ready    <none>   44h    v1.14.8-gke.12
gke-fid-default-pool-8c0eb8cf-p2x4   Ready    <none>   27h    v1.14.8-gke.12
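If you're on GKE and need the extra capacity, resizing the node pool is a one-liner. Here's a sketch, assuming a cluster named fid and the default-pool shown above (substitute your own cluster and pool names):

gcloud container clusters resize fid --node-pool default-pool --num-nodes 6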

Creating the operator

The operator itself runs on Kubernetes in its own namespace (elastic-system). Install it with the default configuration
kubectl apply -f https://download.elastic.co/downloads/eck/1.0.0-beta1/all-in-one.yaml

And you can monitor it starting up if you like
kubectl -n elastic-system logs -f statefulset.apps/elastic-operator
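Before moving on, it's worth confirming the operator pod is running and that the custom resource definitions it registers (including the Elasticsearch kind used below) are in place:

kubectl get pods -n elastic-system
kubectl get crds | grep elastic.co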

Creating the cluster

Next, create the Elasticsearch cluster itself. In this case, I'll use the default quickstart from Elastic, but change from 1 to 3 nodes, add a decent amount of persistent storage, change the storage class to fast (which I've previously assigned to SSD storage), add a LoadBalancer service (to expose the thing externally for testing) with a custom domain, and use a self-signed certificate (which the operator helpfully provides as a Kubernetes secret).

Here's createcluster.yaml, with my modifications applied to the default quickstart spec
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.5.1
  http:
    service:
      spec:
        type: LoadBalancer
    tls:
      selfSignedCertificate:
        subjectAltNames:
        - dns: my.exampledomain.com
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
        storageClassName: fast
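One thing to note: the fast storage class referenced above isn't part of the quickstart; it's one I'd set up earlier and mapped to SSD persistent disks. If you don't already have something similar, a minimal sketch of such a StorageClass on GKE would look like this (the name fast is just my convention):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd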


Create the cluster
kubectl apply -f createcluster.yaml

When the health turns green, it's up and running.
kubectl get elasticsearch

NAME         HEALTH   NODES   VERSION   PHASE   AGE
quickstart   green    3       7.5.1     Ready   24h
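It can take a few minutes to get to green. If you want to watch the individual pods come up in the meantime, the operator labels them with the cluster name, so something like this works:

kubectl get pods --selector='elasticsearch.k8s.elastic.co/cluster-name=quickstart' -w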

You'll then have a series of pods and services like this
kubectl get all | grep quickstart

pod/quickstart-es-default-0                                      1/1     Running   
pod/quickstart-es-default-1                                      1/1     Running
pod/quickstart-es-default-2                                      1/1     Running

service/quickstart-es-default                       ClusterIP           
service/quickstart-es-http                          LoadBalancer   10.15.240.28    xx.xxx.xxx.xxx   9200:30731/TCP

statefulset.apps/quickstart-es-default   3/3   

External IP

Next, we'll need to point the DNS for the custom domain at the exposed IP address, which you can get from the LoadBalancer service like this
kubectl get service quickstart-es-http


NAME                 TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)          AGE
quickstart-es-http   LoadBalancer   10.15.240.28   35.xxx.xxx.xxx   9200:30731/TCP   24h

Head off to your registrar (or DNS provider) and point your domain at that external IP.
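Once the DNS change has propagated, you can confirm the domain resolves to the LoadBalancer's external IP before trying to connect:

dig +short my.exampledomain.com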

Getting the password

The operator creates a default elastic user and stores its generated password in a Kubernetes secret. To start with I'm going to use that, but in a later article I'll also show how to use a certificate for authentication.
First, check the password secret is there
kubectl get secret | grep es-elastic-user

quickstart-es-elastic-user              Opaque                                1      24h

Check it works by connecting to the cluster using the password
ELPASSWORD=$(kubectl get secret quickstart-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode)
curl -u elastic:$ELPASSWORD -k https://my.exampledomain.com:9200

You should get a response like this
{
  "name" : "quickstart-es-default-0",
  "cluster_name" : "quickstart",
  "cluster_uuid" : "lMqADZTyQ8eUP_IoX3etDQ",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

You'll need the actual password value for the client configuration below, so print it out
echo $ELPASSWORD

Connecting to the service externally

I'm using the Node Elasticsearch client (@elastic/elasticsearch), so creating a client would be something like this.

const elasticsearch = require('@elastic/elasticsearch');

// the password is the value decoded from the quickstart-es-elastic-user secret earlier
const uri = 'https://my.exampledomain.com:9200';
const auth = {
  username: 'elastic',
  password: 'thepassword'
};
const client = new elasticsearch.Client({
  node: uri,
  auth,
  ssl: {
    rejectUnauthorized: false
  }
});

Since I'm using a self-signed certificate for now, and not one from a well-known CA, the connection would otherwise be rejected, so rejectUnauthorized: false allows it to go ahead anyway.
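If you'd rather not turn off certificate verification even for testing, the operator also publishes the cluster's CA certificate in a secret, which (if I've read the naming convention correctly) is quickstart-es-http-certs-public here. A sketch of pulling the CA out and pointing curl at it instead of using -k:

kubectl get secret quickstart-es-http-certs-public -o=jsonpath='{.data.ca\.crt}' | base64 --decode > ca.crt
curl --cacert ca.crt -u elastic:$ELPASSWORD https://my.exampledomain.com:9200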

When this is all running, it will be accessed from an API running internally on the same Kubernetes cluster, so there will be no need to expose it externally other than for testing - but more on that later.
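For completeness, from inside the cluster the API won't go via the external LoadBalancer at all; it can use the internal quickstart-es-http service (visible in the service listing above) as the node URI. A quick way to sanity-check that path, assuming everything lives in the default namespace, is a throwaway curl pod:

kubectl run -it --rm curltest --image=curlimages/curl --restart=Never --command -- \
  curl -k -u "elastic:$ELPASSWORD" https://quickstart-es-http.default.svc:9200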

