How I got selected for the LFX Mentorship Program

LFX Mentorship (previously known as Community Bridge) is a platform developed by the Linux Foundation, which promotes and accelerates the adoption, innovation, and sustainability of open-source software.

LFX Mentorship is actively used by the Cloud Native Computing Foundation(CNCF) as a mentorship platform across the CNCF projects

Program Schedule

2022 — Fall Term — September 1st – Nov 30th

2022 — Summer Term — June 1st – August 31st (My Term)

2022 — Spring Term — March 1st – May 31st

How to Apply

You have to write the Cover Letter and mention all the points about why you are interested in the projects and any previous work you have done or not and what you expect from the project etc.

Tip: Start contributing early and talk to the maintainers about your interests in the program and start to discuss the issue/feature you are going to work on.

My Project

My Project is Cluster API Provider for GCP(CAPG). It is a CNCF Project that helps manage the Kubernetes cluster in the Google Cloud Platform. Currently, another provider Cluster API Provider for AWS(CAPA), Cluster API Provider for Azure(CAPZ) has the support for taking advantage of GPU in their cluster but CAPG doesn’t have so Me and Subhasmita my co-mentee will work on the project to add support for GPU in the CAPG.

My Mentors

My Co-Mentee

Well, my journey would be a little monotonous if I didn’t have a co-mentee. It makes my work a little interesting because when we are both stuck on anything we hope on a call and discuss things. Also the weekly work we divide each other and teach each other what we have learned.

How It All Started

I didn’t have any plan to do LFX from the beginning. I started my journey with CAPG for GSoC”22. I applied for the same project and the same feature in the GSoC but that didn’t happen because the project didn’t get selected in the GSoC eventually all the applications to the project got rejected as well. So I talked to the maintainer Richard and told them that can I work in the GPU work as I was very interested in it. He told me that there is still hope in the LFX Mentorship and he opened an application there and I applied there. And then I got selected for the LFX Mentorship 🎉

How It Is Going

I was a little bit worried about how I will work on a big project like this where there are thousands of lines of code and me just a written a project with a max of 500 lines. But I am amazed how the maintainers made my journey very easy and got me onboarded with the introduction to the project for a couple of weeks and gave me small tasks of trying things out and asking a question if I am stuck at any point.

Next Steps:

I will start the GPU work the next week with Subhasmita and keep contributing to the project in the future.

Create a managed cluster using Cluster API Provider for Google Cloud Platform (CAPG)

In the previous blog, I explained how to create and manage Kubernetes with cluster API locally with the help of docker infrastructure.

In this blog, I will explain how to create and manage the k8s with Cluster API in the google cloud.

Note – Throughout the blog, I will use Kubernetes version 1.22.9 and it is recommended to use the version of our OS image created by the image builder. You can check from kubernetes.json and use that.

Step 1 –

  • Create the kind cluster –
kind create cluster --image kindest/node:v1.22.9 --wait 5m

Step 2 –

Follow image builder for GCP steps and build an image.

Step 3 –

  • Export the following env variables – (reference)
export GCP_PROJECT_ID=<YOUR PROJECT ID>
export GOOGLE_APPLICATION_CREDENTIALS=<PATH TO GCP CREDENTIALS>
export GCP_B64ENCODED_CREDENTIALS=$( cat /path/to/gcp-credentials.json | base64 | tr -d '\n' )

export CLUSTER_TOPOLOGY=true
export GCP_REGION="us-east4"
export GCP_PROJECT="<YOU GCP PROJECT NAME>"
export KUBERNETES_VERSION=1.22.9
export IMAGE_ID=projects/$GCP_PROJECT/global/images/<IMAGE ID>
export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
export GCP_NODE_MACHINE_TYPE=n1-standard-2
export GCP_NETWORK_NAME=default
export CLUSTER_NAME=test

Step 4 –

setup the network in this example we are using the default network so we will create some router/nats for our workload cluster to have internet access.

gcloud compute routers create "${CLUSTER_NAME}-myrouter" --project="${GCP_PROJECT}" --region="${GCP_REGION}" --network="default"

gcloud compute routers nats create "${CLUSTER_NAME}-mynat" --project="${GCP_PROJECT}" --router-region="${GCP_REGION}" --router="${CLUSTER_NAME}-myrouter" --nat-all-subnet-ip-ranges --auto-allocate-nat-external-ips

Step 5 –

  • Initialize the infrastructure
clusterctl init --infrastructure gcp
  • Generate the workload cluster config and apply it
clusterctl generate cluster $CLUSTER_NAME --kubernetes-version v1.22.9 > workload-test.yaml

kubectl apply -f workload-test.yaml
  • View the cluster and its resources
$ clusterctl describe cluster $CLUSTER_NAME
NAME                                                               READY  SEVERITY  REASON                 SINCE  MESSAGE
/test                                                              False  Info      WaitingForKubeadmInit  5s
├─ClusterInfrastructure - GCPCluster/test
└─ControlPlane - KubeadmControlPlane/test-control-plane            False  Info      WaitingForKubeadmInit  5s
  └─Machine/test-control-plane-x57zs                               True                                    31s
    └─MachineInfrastructure - GCPMachine/test-control-plane-7xzw2
  • Check the status of the control plane
$ kubectl get kubeadmcontrolplane
NAME                 CLUSTER   INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE    VERSION
test-control-plane   test                                           1                  1         1             2m9s   v1.22.9

Note – The controller plane won’t be ready until the next step when I install the CNI (Container Network Interface).

Step 6 –

  • Get the kubeconfig for the workload cluster
$ clusterctl get kubeconfig $CLUSTER_NAME > workload-test.kubeconfig
  • Apply the cni
kubectl --kubeconfig=./workload-test.kubeconfig \
  apply -f https://docs.projectcalico.org/v3.20/manifests/calico.yaml
  • Wait a bit and you should see this when getting the kubeadmcontrolplane
$ kubectl get kubeadmcontrolplane
NAME                 CLUSTER   INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE     VERSION
test-control-plane   test      true          true                   1          1       1         0             6m33s   v1.22.9


$ kubectl get nodes --kubeconfig=./workload-test.kubeconfig
NAME                       STATUS   ROLES                  AGE   VERSION
test-control-plane-7xzw2   Ready    control-plane,master   62s   v1.22.9

Step 7 –

  • Edit the MachineDeployment in the workload-test.yaml it has 0 replicas add the replicas you want to have your nodes, in this case, we used 2. Apply the workload-test.yaml
$ kubectl apply -f workload-test.yaml
  • After a few minutes, you should see something like this –
$ clusterctl describe cluster $CLUSTER_NAME
NAME                                                               READY  SEVERITY  REASON  SINCE  MESSAGE
/test                                                              True                     15m
├─ClusterInfrastructure - GCPCluster/test
├─ControlPlane - KubeadmControlPlane/test-control-plane            True                     15m
│ └─Machine/test-control-plane-x57zs                               True                     19m
│   └─MachineInfrastructure - GCPMachine/test-control-plane-7xzw2
└─Workers
  └─MachineDeployment/test-md-0                                    True                     10m
    └─2 Machines...                                                True                     13m    See test-md-0-68bd55744b-qpk67, test-md-0-68bd55744b-tsgf6

$ kubectl get nodes --kubeconfig=./workload-test.kubeconfig
NAME                       STATUS   ROLES                  AGE   VERSION
test-control-plane-7xzw2   Ready    control-plane,master   21m   v1.22.9
test-md-0-b7766            Ready    <none>                 17m   v1.22.9
test-md-0-wsgpj            Ready    <none>                 17m   v1.22.9

Yaaa! Now we have a Kubernetes cluster in the GCP with 1 control pannel with 2 worker nodes.

Step 8 –

Delete what you have created –

$ kubectl delete cluster $CLUSTER_NAME

$ gcloud compute routers nats delete "${CLUSTER_NAME}-mynat" --project="${GCP_PROJECT}" \
    --router-region="${GCP_REGION}" --router="${CLUSTER_NAME}-myrouter"

$ gcloud compute routers delete "${CLUSTER_NAME}-myrouter" --project="${GCP_PROJECT}" \
    --region="${GCP_REGION}"

$ kind delete cluster

Advantages of Golang sync.RWMutex over sync.Mutex

First of all, let’s understand what mutex is and why we use it. Mutex is a locking mechanism that protects shared data in a multi-threaded program where multiple threads are accessing the data concurrently. If we don’t use mutex then race conditions might happen in the program that will lead to inconsistent data throughout the program.

There are two types of Mutex in Golang –

  • sync.Mutex
    It protects the shared data both in reading & writing. This means if one thread is reading/writing another thread can’t read/write into the data. And if there is multiple thread reading then the read will happen one by one by each thread.
package main

import (
	"fmt"
	"sync"
	"time"
)

type SyncData struct {
	lock sync.Mutex
	wg   sync.WaitGroup
}

func main() {
	// m := map[int]int{}

	var sc SyncData

	sc.wg.Add(7)

	go readLoop(&sc)
	go readLoop(&sc)
	go readLoop(&sc)
	go readLoop(&sc)
	go writeLoop(&sc)
	go writeLoop(&sc)
	go writeLoop(&sc)

	sc.wg.Wait()
}

func writeLoop(sc *SyncData) {
	sc.lock.Lock()
	time.Sleep(1 * time.Second)
	fmt.Println("Write lock")
	fmt.Println("Write unlock")
	sc.lock.Unlock()
	sc.wg.Done()
}

func readLoop(sc *SyncData) {
	sc.lock.Lock()
	time.Sleep(1 * time.Second)
	fmt.Println("Read lock")
	fmt.Println("Read unlock")
	sc.lock.Unlock()
	sc.wg.Done()
}

Playground

Here you can see the write will block both read/write and the read will block the write as well as read. [E.G – You can see the delay between the read print statements]

  • Sync.RWMutex
    Now we know if the data is the same and the system is read-heavy it is ok to allow multiple threads to read from the data as there won’t be any conflict. So we use RWMutex instead where the idea is any number of readers can acquire the read lock at the same time but only one writer will be able to acquire the write lock at a time.
package main

import (
	"fmt"
	"sync"
	"time"
)

type SyncData struct {
	lock sync.RWMutex
	wg   sync.WaitGroup
}

func main() {
	// m := map[int]int{}

	var sc SyncData

	sc.wg.Add(7)

	go readLoop(&sc)
	go readLoop(&sc)
	go readLoop(&sc)
	go readLoop(&sc)
	go writeLoop(&sc)
	go writeLoop(&sc)
	go writeLoop(&sc)

	sc.wg.Wait()
}

func writeLoop(sc *SyncData) {
	sc.lock.Lock()
	time.Sleep(1 * time.Second)
	fmt.Println("Write lock")
	fmt.Println("Write unlock")
	sc.lock.Unlock()
	sc.wg.Done()
}

func readLoop(sc *SyncData) {
	sc.lock.RLock()
	time.Sleep(1 * time.Second)
	fmt.Println("Read lock")
	fmt.Println("Read unlock")
	sc.lock.RUnlock()
	sc.wg.Done()
}

Playground

Here you can see the write is blocking read & write but read is not blocking any read. Multiple threads is able to read at the same time. [E.G. – You can see the delay is write print statement but you won’t see any delay in read print statement]