Our learning is that operating Kubernetes is complex. There are a lot of moving parts, and learning how to operate Kubernetes well takes time.

Author: 2sofia
Publish Date: 2021-01-05 07:31:08


An important aspect of adopting Kubernetes is thinking about how developers are going to interact with the cluster and deploy their workloads. We wanted to keep things simple and easy to scale. We are converging on Kustomize and Skaffold, along with a few home-grown CRDs, as the way for developers to deploy and manage applications. That said, any team is free to use whatever tools they like to interact with the cluster, as long as those tools are open-source and built on open standards.
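As a rough sketch of what this workflow looks like in practice (the overlay path, registry and team names below are illustrative, not our actual layout):

```shell
# Render manifests through a Kustomize overlay and apply them to the cluster.
# "overlays/production" is a hypothetical directory layout.
kustomize build overlays/production | kubectl apply -f -

# During development, Skaffold can rebuild, push and redeploy on every change;
# the registry is a placeholder.
skaffold dev --default-repo=registry.example.com/our-team
```

The appeal of this combination is that the same declarative manifests drive both the inner development loop and production deploys.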

We finalized on Prometheus. Prometheus is almost the de facto metrics infrastructure today: it is well supported across the CNCF and Kubernetes ecosystems, and it works really well with Grafana. And we love Grafana! Our only problem was that we were using InfluxDB, so we have decided to migrate away from InfluxDB and commit fully to Prometheus.
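To give a flavour of what this looks like day to day (the hostname below is a placeholder for your Prometheus server), metrics can be queried either through Grafana panels or directly over Prometheus' HTTP API:

```shell
# Ask Prometheus' HTTP API which scrape targets are up.
curl -s 'http://prometheus.example.com:9090/api/v1/query?query=up'

# The same PromQL works as a Grafana panel query, e.g. a per-service
# request rate over a 5-minute window:
#   sum(rate(http_requests_total[5m])) by (service)
```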

Metrics, logs, service discovery, distributed tracing, configuration and secret management, CI/CD, local development experience, auto-scaling on custom metrics: these are all areas you need to take care of and make decisions about. And these are only some of the things we are calling out; there are definitely more decisions to make and more infrastructure to set up. One important area is how your developers are going to work with Kubernetes resources and manifests, which we cover later in this blog post.

We mostly operate out of the Singapore region on AWS. When we started our journey with Kubernetes, EKS was not yet available in the Singapore region, so we had to set up our own Kubernetes cluster on EC2 using kops.
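For readers unfamiliar with kops, cluster creation looks roughly like the following (the cluster name, state bucket and node count are placeholders; kops keeps its cluster state in an S3 bucket):

```shell
# Tell kops where to store cluster state.
export KOPS_STATE_STORE=s3://our-kops-state-bucket

# Define a cluster spread across two Singapore availability zones.
kops create cluster \
  --name=k8s.example.com \
  --zones=ap-southeast-1a,ap-southeast-1b \
  --node-count=3

# Apply the configuration and actually create the AWS resources.
kops update cluster k8s.example.com --yes
```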

Logs have always been a big problem for us. We have struggled to build a stable logging platform on ELK, which is full of features our team does not realistically use, and those features come at a cost. We also think there are inherent challenges in using Elasticsearch for logs that make it an expensive solution. We finalized on Loki by Grafana. It is simple, it has the features our team needs, and it is extremely cost-effective. But most importantly, it has a superior UX owing to its query language being very similar to PromQL. It also works well with Grafana, which brings the entire metrics monitoring and logging experience together in one user interface.
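The PromQL resemblance is easiest to see with a concrete query. Using Grafana's `logcli` tool (the Loki address and the `app` label value are placeholders):

```shell
# Point logcli at the Loki server.
export LOKI_ADDR=http://loki.example.com:3100

# Fetch log lines from one app's streams that contain "error".
logcli query '{app="checkout"} |= "error"'

# A metric-style LogQL query: rate of error lines over 5 minutes,
# structurally the same as a PromQL sum(rate(...)) expression.
logcli query 'sum(rate({app="checkout"} |= "error" [5m]))'
```

Anyone comfortable with PromQL can read these queries immediately, which is a large part of the UX win.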

There are a number of ways to use Kubernetes in your development workflow. We mostly narrowed it down to two options: Telepresence.io and Skaffold. Skaffold is capable of watching your local changes and continuously deploying them to your Kubernetes cluster. Telepresence, on the other hand, allows you to run a service locally while setting up a transparent network proxy to the Kubernetes cluster, so that your local service can communicate with other services in Kubernetes as if it were deployed in the cluster. Choosing between them is a matter of personal opinion and preference, and it has been hard to settle on one tool. We are mostly experimenting with Telepresence right now, but we have not ruled out Skaffold being the better tool for us. Only time will tell what we decide to use; perhaps we will end up using both. There are other solutions as well, such as Draft, that deserve a mention.
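The two workflows differ in where your code actually runs. A minimal sketch of each (the deployment name and run command are hypothetical; the Telepresence invocation uses the v1-style CLI that was current at the time):

```shell
# Skaffold: your code runs IN the cluster. Watch local files, rebuild the
# image and redeploy on every change.
skaffold dev

# Telepresence: your code runs ON YOUR MACHINE. Swap out the in-cluster
# deployment "checkout" and proxy its traffic to a locally running process.
telepresence --swap-deployment checkout --run python app.py
```

Skaffold pays an image build on every iteration but exercises the real deployment path; Telepresence gives instant local iteration at the cost of a more intrusive network setup.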

We were using Jenkins before migrating to Kubernetes, and after the migration we decided to stick with it. Our experience so far has been that Jenkins is not the best solution for working with cloud-native infrastructure: we found ourselves doing a lot of plumbing with Python, Bash, Docker, and scripted/declarative Jenkins pipelines to make it all work. Building and maintaining these tools and pipelines started to feel expensive, so we are now exploring Tekton and Argo Workflows as our new CI/CD platform. There are more options to explore in the CI/CD landscape, such as Jenkins X, Screwdriver, and Keptn.

One big learning for us was that we could have taken a different path of less resistance to adopting Kubernetes. We were so bought into Kubernetes as the only solution that we didn't even care to evaluate other options.

We are not doing distributed tracing just yet, but we plan to invest in that area soon. As with logging, our desire is to have distributed tracing available next to metrics and logs in Grafana, to deliver a more integrated observability experience to our development teams.

We will see in this blog post that migrating to and operating Kubernetes is not the same as deploying on cloud VMs or bare metal. There is a learning curve for your cloud engineering and development teams, and it might be worth it for your team to go through it. But the question is whether you need to do that now, and you must try to answer it clearly.

To avoid all this, we decided to use Consul, Vault and Consul Template for configuration management. We run Consul Template as an init container today, and we plan to run it as a sidecar in pods so that it can watch for configuration changes in Consul, refresh expiring secrets from Vault, and gracefully reload application processes.
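A minimal sketch of the Consul Template invocation, assuming hypothetical template and output paths and an application that reloads on SIGHUP:

```shell
# Render /etc/app/app.conf from Consul KV and Vault data, then run the
# reload command whenever the rendered output changes.
consul-template \
  -template "/etc/templates/app.conf.tpl:/etc/app/app.conf:kill -HUP 1" \
  -once   # init-container mode: render once and exit; drop -once to
          # run as a long-lived sidecar that watches for changes
```

The `in:out:command` triple in `-template` is what makes the sidecar mode attractive: the reload step runs automatically, with no imperative kubectl orchestration.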


You will find that most articles use ConfigMap and Secret objects in Kubernetes. Our learning is that they can get you started, but we found them barely enough for our use-cases. Using ConfigMaps with existing services comes at a certain cost: a ConfigMap can only be consumed by pods in a limited number of ways, with environment variables being the most common. If you have a ton of legacy microservices that read configuration from files rendered by a configuration management tool such as Puppet, Chef or Ansible, you will have to redo configuration handling in all your code bases to read from environment variables instead, and we didn't find enough reason to do that. Also, a change in configuration or secrets means you have to redeploy by patching the Deployment, which adds imperative orchestration of kubectl commands.
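To make the "imperative orchestration" point concrete, here is roughly what a ConfigMap change forces you to do (the ConfigMap, file and deployment names are illustrative):

```shell
# Update the ConfigMap in place from a local file.
kubectl create configmap app-config --from-file=app.conf \
  --dry-run=client -o yaml | kubectl apply -f -

# Pods do not pick up the change on their own; you have to roll them yourself.
kubectl rollout restart deployment/app
```

Every configuration change becomes a two-step kubectl dance, which is exactly the kind of plumbing a watching agent like Consul Template removes.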

Out-of-the-box Kubernetes is never enough, for almost anyone. It is a great playground to learn and explore, but you will most likely need more infrastructure components on top, tied together well, to make it a meaningful solution for your developers and their applications. This bundle of Kubernetes plus additional infrastructure components and policies is often called an internal Kubernetes platform. It is an extremely useful paradigm, and there are several ways to extend Kubernetes.

Setting up a basic cluster is perhaps not that difficult; we were able to get our first cluster up and running within a week. Most issues appear when you start deploying your workloads. From tuning the cluster autoscaler, to provisioning resources at the right time, to configuring the network for the right performance, you have to research and configure everything yourself. The defaults rarely work for production (or at least they didn't work for us back then).



Category: general