Learning: Kubernetes – Deployments & StatefulSet

Deployments

Deployments are the usual way to manage pods in k8s. In a Deployment we describe the desired state of the pods, such as which image version they run and how many replicas there should be.

  • Properties
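    • The spec.selector specifies which pods the Deployment manages.
    • When we update a deployment, it replaces pods gradually: it creates new pods and deletes old ones a few at a time. With the default settings (maxSurge and maxUnavailable of 25% each), at least 75% of the desired number of pods stays available and at most 125% of the desired number exists at any time.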
  • Rolling Back to a Previous Version
    • To roll back to the previous revision we use – kubectl rollout undo deployment/nginx-deployment
    • To roll back to a specific earlier revision we use – kubectl rollout undo deployment/nginx-deployment --to-revision=2
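
To make the 75% / 125% bounds concrete, here is a minimal sketch of the arithmetic. The function name rolloutBounds is made up for this example; the rounding (maxSurge rounds up, maxUnavailable rounds down) follows how Kubernetes treats percentage values.

```go
package main

import "fmt"

// rolloutBounds returns the minimum number of available pods and the maximum
// total number of pods during a rolling update. Percentages are whole numbers
// (25 means 25%). maxSurge is rounded up and maxUnavailable is rounded down.
func rolloutBounds(replicas, maxSurgePct, maxUnavailablePct int) (minAvailable, maxTotal int) {
	surge := (replicas*maxSurgePct + 99) / 100        // round up
	unavailable := replicas * maxUnavailablePct / 100 // round down
	return replicas - unavailable, replicas + surge
}

func main() {
	for _, replicas := range []int{4, 10} {
		minAvailable, maxTotal := rolloutBounds(replicas, 25, 25)
		fmt.Printf("replicas=%d minAvailable=%d maxTotal=%d\n", replicas, minAvailable, maxTotal)
	}
}
```

With the defaults, 10 replicas give at least 8 available pods and at most 13 pods in total during the update.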

StatefulSet

Just as we manage stateless applications with Deployments, we manage stateful applications with StatefulSets.

  • Properties
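    • The pods of a StatefulSet are not created or deleted at the same time; by default they are created and deleted one by one, in order.
    • The pods can’t be addressed interchangeably; each one has its own identity.
    • The replicas here are not identical.
    • Each pod gets a unique identifier (an ordinal index) in increasing order, and it keeps that identity when it is rescheduled.
    • Each pod has its own persistent storage.
    • In a typical replicated database running on a StatefulSet, there is a master pod that is the only one allowed to change data. This is a property of the application, not something the StatefulSet enforces.
    • All the slave pods sync with the master pod in order to achieve data consistency.
    • When a new pod joins, it first clones all the data from one of the slave pods and after that starts to sync.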
  • StatefulSets are valuable for applications that require one or more of the following.
    • Stable, unique network identifiers.
    • Stable, persistent storage.
    • Ordered, graceful deployment and scaling.
    • Ordered, automated rolling updates.
  • Data Persistence – If a pod dies, the data stored inside its containers is lost. To counter this, we attach a persistent volume to every pod.
    • The volume holds the pod’s state data, kept in sync with the pod.
    • When a pod gets replaced, the persistent volume gets reattached to the new pod and the state of the pod is resumed.
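
The stable network identifiers can be sketched in a few lines. This is only an illustration: the names mysql and default are made up, and the DNS form assumes a headless Service governs the StatefulSet and that the cluster uses the default domain cluster.local.

```go
package main

import "fmt"

// podName returns the stable name of the pod with the given ordinal:
// <statefulset-name>-<ordinal>.
func podName(statefulSet string, ordinal int) string {
	return fmt.Sprintf("%s-%d", statefulSet, ordinal)
}

// podDNS returns the stable DNS name the pod gets through the headless
// Service, assuming the default cluster domain cluster.local.
func podDNS(statefulSet string, ordinal int, service, namespace string) string {
	return fmt.Sprintf("%s.%s.%s.svc.cluster.local", podName(statefulSet, ordinal), service, namespace)
}

func main() {
	// Hypothetical StatefulSet "mysql" with 3 replicas in namespace "default".
	for i := 0; i < 3; i++ {
		fmt.Println(podName("mysql", i), "->", podDNS("mysql", i, "mysql", "default"))
	}
}
```

Because the name depends only on the StatefulSet name and the ordinal, a rescheduled pod comes back under the same name and can find the same volume again.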

What Is a System Call

System Call – It is the interface through which a userspace program requests resources and services from the kernel.

Now, why do we need system calls?

  • Reading from and writing to files requires system calls.
  • Creating or deleting files in a file system requires system calls.
  • System calls are used for the creation and management of new processes.
  • Network connections need system calls for sending and receiving packets.
  • Access to hardware devices such as a scanner or printer needs system calls.

Here are the five types of system calls in an OS:

  • Process Control – These system calls deal with process creation and termination, waiting for and signaling events, and allocating and freeing memory.
  • File Management – These deal with file manipulation such as create, update, delete, read, write, and getting or setting file attributes.
  • Device Management – These deal with requesting and releasing devices, reading from and writing to device buffers, and attaching and detaching logical devices.
  • Information Maintenance – These get or set system information such as the time, date, and process attributes, and so carry that data between the user program and the OS kernel.
  • Communications – These are used for inter-process communication: creating and deleting communication connections, sending and receiving messages, etc.
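
A program rarely issues system calls by hand; the standard library does it for us. As a small sketch, the Go program below uses the os package, whose functions end up in file-management and information-maintenance system calls. The helper name demo and the file name demo.txt are made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// demo creates a temporary directory, writes a file in it, reads the file
// back, and removes everything again. Each os call below ends up in one or
// more file-management system calls.
func demo(data string) (string, error) {
	dir, err := os.MkdirTemp("", "syscall-demo") // create a directory
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir) // delete the directory and its contents

	path := filepath.Join(dir, "demo.txt")
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil { // open, write, close
		return "", err
	}
	got, err := os.ReadFile(path) // open, read, close
	if err != nil {
		return "", err
	}
	return string(got), nil
}

func main() {
	// Information maintenance: ask the kernel for this process's id.
	fmt.Println("pid:", os.Getpid())

	got, err := demo("hello kernel")
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("read back:", got)
}
```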

Learning: Kubernetes – Persistent Volume & Persistent Volume Claim

Volume – A volume in Kubernetes can be thought of as a directory that can be accessed by the containers in a pod. A volume helps persist the data even if a container in the pod restarts.

  • PV
    • A Persistent Volume (PV) is a piece of storage in the cluster.
    • It is a cluster-level resource like a node and doesn’t belong to any namespace.
    • It is either manually provisioned by an administrator or dynamically provisioned by Kubernetes using a StorageClass.
  • PVC
    • A PersistentVolumeClaim (PVC) is a request for storage by a user that can be fulfilled by a PV.
    • PersistentVolumes and PersistentVolumeClaims are independent of Pod lifecycles and preserve data through restarting, rescheduling, and even deleting Pods.
  • Access Modes
    • ReadWriteOnce – The volume can be mounted as read-write by a single node. Multiple pods running on that same node can access the volume.
    • ReadOnlyMany – The volume can be mounted as read-only by many nodes.
    • ReadWriteMany – The volume can be mounted as read-write by many nodes.
    • ReadWriteOncePod – The volume can be mounted as read-write by a single pod in the whole cluster.
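
The difference between the modes is easiest to see as a small decision table. The sketch below is a simplified model, not how Kubernetes enforces access modes: the function allowsSecondPod is made up, and it only answers whether a second pod may use a volume that one pod already uses.

```go
package main

import "fmt"

// AccessMode mirrors the four PersistentVolume access mode names.
type AccessMode string

const (
	ReadWriteOnce    AccessMode = "ReadWriteOnce"
	ReadOnlyMany     AccessMode = "ReadOnlyMany"
	ReadWriteMany    AccessMode = "ReadWriteMany"
	ReadWriteOncePod AccessMode = "ReadWriteOncePod"
)

// allowsSecondPod is a simplified model: given a volume already used by one
// pod, may a second pod use it too? sameNode says whether both pods run on
// the same node.
func allowsSecondPod(mode AccessMode, sameNode bool) bool {
	switch mode {
	case ReadWriteOnce:
		return sameNode // one node only, but several pods on that node
	case ReadOnlyMany, ReadWriteMany:
		return true // many nodes
	case ReadWriteOncePod:
		return false // a single pod in the whole cluster
	default:
		return false
	}
}

func main() {
	modes := []AccessMode{ReadWriteOnce, ReadOnlyMany, ReadWriteMany, ReadWriteOncePod}
	for _, m := range modes {
		fmt.Printf("%-16s same node: %-5v other node: %v\n",
			m, allowsSecondPod(m, true), allowsSecondPod(m, false))
	}
}
```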

Learning: Kubernetes – Container Runtime Interface & Garbage Collection

Container Runtime Interface

The Container Runtime Interface (CRI) is the primary protocol for the communication between the kubelet and Container Runtime.

Container Runtime – It is the software that runs & manages containers on a host operating system. There are a number of container runtimes available, such as Docker, runC, and containerd.

To abstract over all the container runtimes that Kubernetes supports, the community introduced the CRI (Container Runtime Interface), a common API that the kubelet uses to talk to any container runtime.

The kubelet talks to the container runtime over gRPC, where the kubelet is the client and the container runtime (or its CRI shim) is the server.
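
The value of the abstraction can be sketched with a plain Go interface. Everything here is a stand-in: RuntimeService is a heavily reduced, hypothetical version of the CRI runtime service with simplified signatures (the real one is defined in protobuf and served over gRPC), and fakeRuntime takes the place of a real runtime.

```go
package main

import "fmt"

// RuntimeService is a hypothetical, reduced stand-in for the CRI runtime
// service. The method names exist in the real CRI, the signatures do not.
type RuntimeService interface {
	RunPodSandbox(podName string) (sandboxID string, err error)
	CreateContainer(sandboxID, image string) (containerID string, err error)
	StartContainer(containerID string) error
}

// fakeRuntime plays the server side of the interface.
type fakeRuntime struct {
	name    string
	started []string
}

func (f *fakeRuntime) RunPodSandbox(podName string) (string, error) {
	return f.name + "/sandbox/" + podName, nil
}

func (f *fakeRuntime) CreateContainer(sandboxID, image string) (string, error) {
	return sandboxID + "/" + image, nil
}

func (f *fakeRuntime) StartContainer(containerID string) error {
	f.started = append(f.started, containerID)
	return nil
}

// startPod plays the kubelet side: it only knows the interface, not the
// runtime behind it.
func startPod(rt RuntimeService, podName, image string) (string, error) {
	sandboxID, err := rt.RunPodSandbox(podName)
	if err != nil {
		return "", err
	}
	containerID, err := rt.CreateContainer(sandboxID, image)
	if err != nil {
		return "", err
	}
	if err := rt.StartContainer(containerID); err != nil {
		return "", err
	}
	return containerID, nil
}

func main() {
	rt := &fakeRuntime{name: "fake"}
	id, err := startPod(rt, "web", "nginx")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("started container:", id)
}
```

Swapping the runtime means passing a different implementation to startPod; the caller does not change.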

Garbage Collection

It is the term that k8s uses for cleaning up cluster resources.

  • Owner & Dependents – In k8s some objects depend on others. A dependent records its owner in metadata.ownerReferences, so k8s knows which related objects to clean up when the owner is deleted.
  • Cascading Deletion – When an owner is deleted, k8s also deletes its dependents, for example the pods left after deleting a ReplicaSet.
    • Foreground Cascading Deletion –
      • The object we are trying to delete goes into a “deletion in progress” state.
      • The Kubernetes API server sets the object’s metadata.deletionTimestamp field to the time the object was marked for deletion.
      • The Kubernetes API server also sets the metadata.finalizers field to foregroundDeletion.
      • After the object enters the in-progress state, the controller deletes all the dependents and then removes the owner object.
    • Background Cascading Deletion –
      • Here k8s deletes the owner object immediately.
      • Then the controller cleans up the dependent objects in the background.
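
The only difference between the two modes is the order in which things disappear. Here is a toy model of that ordering; the object type and the deleteCascade function are made up for this sketch and ignore everything else the garbage collector does.

```go
package main

import "fmt"

// object is a toy API object: a name plus the name of its owner ("" if none).
type object struct {
	name  string
	owner string
}

// deleteCascade returns the order in which objects disappear when the named
// owner is deleted.
func deleteCascade(store []object, ownerName string, foreground bool) []string {
	var dependents []string
	for _, o := range store {
		if o.owner == ownerName {
			dependents = append(dependents, o.name)
		}
	}
	if foreground {
		// Foreground: dependents go first, the owner is removed last.
		return append(dependents, ownerName)
	}
	// Background: the owner goes first, dependents are cleaned up afterwards.
	return append([]string{ownerName}, dependents...)
}

func main() {
	store := []object{
		{name: "rs"},
		{name: "pod-a", owner: "rs"},
		{name: "pod-b", owner: "rs"},
	}
	fmt.Println("foreground:", deleteCascade(store, "rs", true))
	fmt.Println("background:", deleteCascade(store, "rs", false))
}
```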

Learning: Kubernetes – Why We Need Pod Abstraction Above Containers

I discussed pods in the previous blog. In short, a container is a standard unit of software that packages up code and all its dependencies in an isolated environment that has its own file system.

As nodes are VMs or physical machines, we could have run containers directly on them without the pod abstraction. But that would cause a major problem in managing the cluster, and that problem is networking.

As we all know, a containerized application listens on a specific port, and two applications can’t occupy the same port on the same host. So if you need two containers of the same application running on a node, the second one has to run on a different port, and keeping track of the connections between them gets very messy.

That is the problem Kubernetes solves with the pod abstraction. Each pod has its own network namespace, which means each pod has its own virtual ethernet interface and IP address. It’s like the pod is a small VM inside the node. Now every pod can run its application container on the same port without conflict, because each pod behaves like a self-contained, isolated machine.

Now suppose a pod has more than one container (the main container and a helper container). Then the containers inside the pod communicate with each other using localhost.
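
The port conflict is easy to reproduce on any machine. The sketch below binds a TCP port on localhost and then tries to bind the same address a second time, which is what two containers sharing one network namespace would run into. The function name secondListenFails is made up for this example.

```go
package main

import (
	"fmt"
	"net"
)

// secondListenFails binds a TCP port on localhost and then tries to bind the
// same address again. It reports whether the second attempt was rejected.
func secondListenFails() (bool, error) {
	first, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick a free port
	if err != nil {
		return false, err
	}
	defer first.Close()

	second, err := net.Listen("tcp", first.Addr().String())
	if err != nil {
		return true, nil // the port is already taken
	}
	second.Close()
	return false, nil
}

func main() {
	failed, err := secondListenFails()
	if err != nil {
		fmt.Println("could not open the first listener:", err)
		return
	}
	fmt.Println("second listener rejected:", failed)
}
```

With one network namespace per pod, each pod gets its own port space, so this conflict only arises between containers of the same pod.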

Create a GraphQL API using Golang – Part 1

For the past couple of days, I have been tinkering with GraphQL. I followed an awesome blog post and created my own GraphQL API. But the blog post only covers creating links and getting links. Once I understood all the components, I started extending the application to support Update & Delete as well.

You should complete that blog post first and then follow this one to extend the app. Let’s start –

Get A Single Link

First, add a query to the GraphQL schema – graph/schema.graphqls

type Query {
  links: [Link!]!
  link(id: ID!): Link!
}

Then run $ go run github.com/99designs/gqlgen generate

You will see a resolver being created in schema.resolvers.go with the below function signature –

func (r *queryResolver) Link(ctx context.Context, id string) (*model.Link, error) {

Now go to internal/links/links.go and add a Get method that fetches the Link with the given id from the database.

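// Get returns the link with the given id.
// Note that log.Fatal stops the whole process on any error, including a missing row.
func Get(id string) Links {
	var link Links
	stmt, err := database.Db.Prepare("SELECT ID, Title, Address FROM Links WHERE ID=?")
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()

	// QueryRow returns at most one row; Scan copies its columns into link.
	err = stmt.QueryRow(id).Scan(&link.ID, &link.Title, &link.Address)
	if err != nil {
		log.Fatal(err)
	}
	return link
}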

Now it’s time for the resolver to come into the picture –

func (r *queryResolver) Link(ctx context.Context, id string) (*model.Link, error) {
	link := links.Get(id)
	return &model.Link{
		ID:      link.ID,
		Title:   link.Title,
		Address: link.Address,
	}, nil
}
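
One thing to keep in mind: the resolver already returns an error value, so a missing id does not have to end the process. The sketch below shows that pattern under stated assumptions. The map store stands in for the Links table, and ErrNotFound, get, resolveLink and isNotFound are names made up for this example; with database/sql the equivalent check would be against sql.ErrNoRows.

```go
package main

import (
	"errors"
	"fmt"
)

// Link mirrors the fields used in this post.
type Link struct {
	ID, Title, Address string
}

// ErrNotFound is a hypothetical sentinel error for a missing link.
var ErrNotFound = errors.New("link not found")

// store is an in-memory stand-in for the Links table.
var store = map[string]Link{
	"1": {ID: "1", Title: "Go", Address: "https://go.dev"},
}

// get returns an error instead of exiting the process when the id is unknown.
func get(id string) (Link, error) {
	link, ok := store[id]
	if !ok {
		return Link{}, fmt.Errorf("get link %q: %w", id, ErrNotFound)
	}
	return link, nil
}

// resolveLink plays the role of the resolver: it hands the error back to the
// caller, which can turn it into a GraphQL error response.
func resolveLink(id string) (*Link, error) {
	link, err := get(id)
	if err != nil {
		return nil, err
	}
	return &link, nil
}

// isNotFound reports whether err wraps ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	for _, id := range []string{"1", "42"} {
		link, err := resolveLink(id)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("found:", link.Title, link.Address)
	}
}
```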

Update A Link

Now it is time to update an existing link. This time, as we are writing to the database, we have to use a mutation.

type Mutation {
   updateLink(id: ID!, input: NewLink!): Link!
}

Now let’s generate the same using $ go run github.com/99designs/gqlgen generate

And add the Update method in links.go –

func (link Links) Update(id string) int64 {
	stmt, err := database.Db.Prepare("UPDATE Links SET Title=? , Address=? WHERE ID=?")
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()

	res, err := stmt.Exec(link.Title, link.Address, id)
	if err != nil {
		log.Fatal(err)
	}

	rowsAffected, err := res.RowsAffected()
	if err != nil {
		log.Fatal(err)
	}
	return rowsAffected
}

Here we take the id as input, run an update on that row, and return the affected rows count.

Now it’s time for the resolvers –

func (r *mutationResolver) UpdateLink(ctx context.Context, id string, input model.NewLink) (*model.Link, error) {
	link := links.Links{
		Title:   input.Title,
		Address: input.Address,
	}
	rowsAffected := link.Update(id)
	if rowsAffected == 0 {
		return nil, errors.New("zero rows affected")
	}
	return &model.Link{
		ID:      id,
		Title:   link.Title,
		Address: link.Address,
	}, nil
}

Here we take the GraphQL input, run the update with it, and return the updated value. In between, we check whether zero rows were affected and return an error in that case.

Delete A Link

Now as usual add the mutation first in the schema –

type Mutation {
    deleteLink(id: ID!): String!
}

Now let’s generate the same using $ go run github.com/99designs/gqlgen generate

Now add the code in links.go to perform the delete operation –

func Delete(id string) int64 {
	stmt, err := database.Db.Prepare("DELETE FROM Links WHERE ID=?")
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()

	res, err := stmt.Exec(id)
	if err != nil {
		log.Fatal(err)
	}

	rowsAffected, err := res.RowsAffected()
	if err != nil {
		log.Fatal(err)
	}
	return rowsAffected
}

Here we run the delete query and get the rows affected count and return it.

Now it’s time to resolve it –

func (r *mutationResolver) DeleteLink(ctx context.Context, id string) (string, error) {
	rowsAffected := links.Delete(id)
	if rowsAffected == 0 {
		return "", errors.New("zero rows affected")
	}
	return fmt.Sprintf("%v rows affected", rowsAffected), nil
}

Here we call Delete with the desired id string as input, and if the link got deleted successfully we return a string of the form “<number> rows affected”.

Create a key Value storage using Golang – Part 2

In the previous blog I discussed how to make a key-value store with in-memory storage. Now I am going to discuss how you can extend it to use file system storage.

Let’s first define our structure which is going to hold the information about the storage –

type DiskFS struct {
	FS             filesystem.Fs
	RootFolderName string
}

For this project we are going to create our own filesystem implementation. We could use afero, but I decided to write my own for more learning.

First create a directory in the root folder called filesystem and create a file called fs.go and write the following code

package filesystem

import (
	"io"
	"os"
)

type FileSystem struct {
	Fs
}

type File interface {
	io.Closer
	io.Reader
	io.ReaderAt
	io.Seeker
	io.Writer
	io.WriterAt

	Name() string
	Readdir(count int) ([]os.FileInfo, error)
	Stat() (os.FileInfo, error)
	Sync() error
	WriteString(s string) (ret int, err error)
}

type Fs interface {
	Create(name string) (File, error)
	Mkdir(name string, perm os.FileMode) error
	Open(name string) (File, error)
	OpenFile(name string, flag int, perm os.FileMode) (File, error)
	Stat(name string) (os.FileInfo, error)
	Remove(name string) error
}

Now, as we are going to use the os package to implement the filesystem, create a file called osfs.go and write the below code

package filesystem

import (
	"os"
)

type OsFs struct {
	Fs
}

// Return a File System for OS
func NewOsFs() Fs {
	return &OsFs{}
}

func (OsFs) Create(name string) (File, error) {
	file, err := os.Create(name)
	if err != nil {
		return nil, err
	}
	return file, nil
}

func (OsFs) Open(name string) (File, error) {
	file, err := os.Open(name)
	if err != nil {
		return nil, err
	}
	return file, nil
}

func (OsFs) OpenFile(name string, flag int, perm os.FileMode) (File, error) {
	file, err := os.OpenFile(name, flag, perm)
	if err != nil {
		return nil, err
	}
	return file, nil
}

func (OsFs) Mkdir(name string, perm os.FileMode) error {
	return os.Mkdir(name, perm)
}

func (OsFs) Stat(name string) (os.FileInfo, error) {
	return os.Stat(name)
}

func (OsFs) Remove(name string) error {
	return os.Remove(name)
}

Now let’s create utility functions to handle some situations. Create a utils.go file and write the code

package filesystem

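import (
	"io"
	"os"
)

// DirExists reports whether name exists and is a directory.
func DirExists(fs Fs, name string) (bool, error) {
	file, err := fs.Stat(name)
	if err == nil && file.IsDir() {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}

// Exists reports whether a file or directory called name exists.
func Exists(fs Fs, name string) (bool, error) {
	_, err := fs.Stat(name)
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}

// ReadDir returns the entries of the directory dirName.
func ReadDir(fs Fs, dirName string) ([]os.FileInfo, error) {
	dir, err := fs.Open(dirName)
	if err != nil {
		return nil, err
	}
	defer dir.Close()
	// A count of -1 asks for all entries in a single slice.
	list, err := dir.Readdir(-1)
	if err != nil {
		return nil, err
	}
	return list, nil
}

// ReadFile reads the whole file called name through the Fs abstraction.
func ReadFile(fs Fs, name string) ([]byte, error) {
	file, err := fs.Open(name)
	if err != nil {
		return nil, err
	}
	defer file.Close()
	// Read from the handle we just opened instead of calling os.ReadFile,
	// which would bypass the Fs implementation.
	data, err := io.ReadAll(file)
	if err != nil {
		return nil, err
	}
	return data, nil
}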

Ok, so our filesystem is done. Now it’s time to write the file system based storage implementation. Write the below code inside the records.go file

// Return the Disk structure file system
func NewDisk(rootFolder string) *DiskFS {
	diskFs := filesystem.NewOsFs()
	ok, err := filesystem.DirExists(diskFs, rootFolder)
	if err != nil {
		log.Fatalf("Dir exists: %v", err)
	}
	if !ok {
		err := diskFs.Mkdir(rootFolder, os.ModePerm)
		if err != nil {
			log.Fatalf("Create dir: %v", err)
		}
	}
	return &DiskFS{FS: diskFs, RootFolderName: rootFolder}
}

// Store key, value in the file system
func (d *DiskFS) Store(key, val string) {
	file, err := d.FS.Create(d.RootFolderName + "/" + key)
	if err != nil {
		log.Fatalf("Create file: %v", err)
	}
	defer file.Close()

	_, err = file.Write([]byte(val))
	if err != nil {
		log.Fatalf("Writing file: %v", err)
	}
}

func (d *DiskFS) List() map[string]string {
	m := make(map[string]string, 2)
	dir, err := filesystem.ReadDir(d.FS, d.RootFolderName)
	if err != nil {
		log.Fatalf("Error reading the directory: %v", err)
	}

	for _, fileName := range dir {
		content, err := filesystem.ReadFile(d.FS, d.RootFolderName+"/"+fileName.Name())
		if err != nil {
			log.Fatalf("Error reading the file: %v", err)
		}
		m[fileName.Name()] = string(content)
	}
	return m
}

func (d *DiskFS) Get(key string) (string, error) {
	ok, err := filesystem.Exists(d.FS, d.RootFolderName+"/"+key)
	if err != nil {
		log.Fatalf("File exist: %v", err)
	}

	if ok {
		file, err := filesystem.ReadFile(d.FS, d.RootFolderName+"/"+key)
		if err != nil {
			log.Fatalf("Error reading the file: %v", err)
		}
		return string(file), nil
	}
	return "", errors.New("key not found")
}

func (d *DiskFS) Delete(key string) error {
	ok, err := filesystem.Exists(d.FS, d.RootFolderName+"/"+key)
	if err != nil {
		log.Fatalf("File exist: %v", err)
	}
	if ok {
		err = d.FS.Remove(d.RootFolderName + "/" + key)
		if err != nil {
			log.Fatalf("Delete file err: %v", err)
		}
		return nil
	}
	return errors.New("key not found")
}

Now if you run the app with the -storage-type=disk flag, you can use all the file system based operations. You will see a directory called storage get created; inside it there is one file per key, and the content of each file is the value.
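
To see the one-file-per-key layout without the rest of the app, here is a compact sketch that uses the os package directly instead of the filesystem package from this post. The helpers store, get, list and demo are made up for this example, and it runs in a temporary directory rather than in storage.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// store writes val into a file named after key inside root.
func store(root, key, val string) error {
	return os.WriteFile(filepath.Join(root, key), []byte(val), 0o644)
}

// get reads the value back from the file named after key.
func get(root, key string) (string, error) {
	data, err := os.ReadFile(filepath.Join(root, key))
	return string(data), err
}

// list returns every key in root with its value, one file per key.
func list(root string) (map[string]string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	m := make(map[string]string, len(entries))
	for _, e := range entries {
		val, err := get(root, e.Name())
		if err != nil {
			return nil, err
		}
		m[e.Name()] = val
	}
	return m, nil
}

// demo stores two keys in a temporary root folder and lists them.
func demo() (map[string]string, error) {
	root, err := os.MkdirTemp("", "storage")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	if err := store(root, "name", "gopher"); err != nil {
		return nil, err
	}
	if err := store(root, "lang", "go"); err != nil {
		return nil, err
	}
	return list(root)
}

func main() {
	m, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println(m)
}
```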

In the next part I am going to write the tests for the application.

Learning: Kubernetes – Controller & Cloud Controller

Controller

A k8s controller is like a thermostat in a room. It checks the current temperature and maintains the desired temperature by turning the switch on & off. In the same way, a k8s controller checks the current state and verifies whether it is equal to the desired state. If the current state does not match the desired state, it makes the necessary changes to bring the cluster to the desired state.

Types of controllers –

  • ReplicaSet – It is responsible for maintaining the desired number of pods.
  • Deployment – Deployment is the most common way to get your app on Kubernetes. It maintains a ReplicaSet with the desired configuration.
  • StatefulSet – A StatefulSet is used to manage stateful applications with persistent storage.
  • Job – A Job creates one or more short-lived Pods and expects them to successfully terminate.
  • CronJob – A CronJob creates Jobs on a schedule.
  • DaemonSet – A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them.
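
The thermostat idea boils down to a loop that compares the current state with the desired state and takes one corrective step at a time. Here is a minimal sketch of such a loop for a replica count; reconcile and converge are names made up for this example, not Kubernetes APIs.

```go
package main

import "fmt"

// reconcile compares the current and desired replica counts and returns the
// single action a controller would take next, plus the resulting count.
func reconcile(current, desired int) (action string, next int) {
	switch {
	case current < desired:
		return "create pod", current + 1
	case current > desired:
		return "delete pod", current - 1
	default:
		return "nothing to do", current
	}
}

// converge runs the control loop until the current state matches the desired
// state and returns the number of steps it took.
func converge(current, desired int) int {
	steps := 0
	for current != desired {
		var action string
		action, current = reconcile(current, desired)
		steps++
		fmt.Printf("step %d: %s (now %d)\n", steps, action, current)
	}
	return steps
}

func main() {
	fmt.Println("steps:", converge(1, 3))
	fmt.Println("steps:", converge(5, 3))
}
```

A real controller never stops looping; it keeps watching, so a pod that dies later triggers another corrective step.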

Cloud Controller

The cloud controller manager lets you link your cluster into your cloud provider’s API, and separates out the components that interact with that cloud platform from components that only interact with your cluster.

  • Different functions
    • Node Controller –
      • It updates node objects when new servers are created in the cloud.
      • It annotates and labels the node object with cloud-specific information.
      • It obtains the node’s hostname & network addresses.
      • It checks the node’s health. If the node has been deleted from the cloud, it also removes the node from the k8s cluster.
    • Route Controller – It configures the routes properly so the nodes on the k8s cluster can communicate with each other.
    • Service Controller – It interacts with the cloud provider’s API to set up a load balancer and other infrastructure components.

Learning: Kubernetes – ConfigMaps & Secrets

Every application has configuration data like an API key, DB URL, DB user, DB password, etc. Yes, you can hardcode this data in your application, but after some time it becomes unmanageable. You need some kind of dynamic solution where you define all this data once and every component of your cluster can access it.

Suppose our application’s DB URL changes. Then we need to change every place where the URL is used, which means changing the code and rebuilding the application. To counter this situation we use ConfigMaps and Secrets for storing the application’s required data.

ConfigMaps – A ConfigMap is an API object used to store non-confidential data in key-value pairs. Pods can consume ConfigMaps as environment variables, command-line arguments, or as configuration files in a volume.

Ex – A ConfigMap contains the URL of the database, the user name, and other non-credential data.

Secrets – A Secret is similar to a ConfigMap, except that it is used for sensitive information such as credentials. One of the main differences is that you have to explicitly tell kubectl to show you the contents of a Secret.

  • Its values are encoded in base64. This is an encoding, not encryption, so anyone who can read the Secret can decode it.
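
From the application’s side, consuming a ConfigMap or Secret as environment variables is just reading the environment. The sketch below shows that, and also shows why base64 is no protection. The variable name DB_URL, the fallback value, and the helpers envOr, encode and decode are made up for this example.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"os"
)

// envOr reads a setting from the environment, which is one way a pod can
// receive ConfigMap or Secret values, and falls back to def when it is unset.
func envOr(key, def string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return def
}

// encode returns the base64 form of plain, the way Secret values are stored.
func encode(plain string) string {
	return base64.StdEncoding.EncodeToString([]byte(plain))
}

// decode reverses encode. No key is needed, which is why base64 is not
// encryption.
func decode(encoded string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(encoded)
	return string(b), err
}

func main() {
	// DB_URL is a hypothetical variable name used only for this sketch.
	fmt.Println("db url:", envOr("DB_URL", "localhost:5432"))

	enc := encode("s3cr3t")
	dec, err := decode(enc)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(enc, "->", dec)
}
```

Because the application only reads DB_URL from its environment, changing the URL means updating the ConfigMap and restarting the pods, not rebuilding the image.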

Learning: Kubernetes – Labels & Selectors, Finalizers

Labels & Selectors

Labels – Labels are key-value pairs attached to objects such as pods, ReplicaSets & Services. They are used for identifying and grouping objects. They can be added at creation time, and added or modified at run time.

Some properties of a label value –

  • must be 63 characters or less (can be empty),
  • unless empty, must begin and end with an alphanumeric character ([a-z0-9A-Z]),
  • could contain dashes (-), underscores (_), dots (.), and alphanumerics between.

Selectors – The Kubernetes API currently supports two types of selectors −

  • Equality-based selectors – We use =, == and != as equality operators.
environment = production
tier != frontend
  • Set-based selectors – They allow filtering objects by a set of values. There are three kinds of operators allowed: in, notin and exists.
environment in (production, qa)
tier notin (frontend, backend)
partition
!partition
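
The matching rules behind these expressions fit in a few small functions. This is a sketch of the semantics only; the Labels type and the match helpers are made up for this example and do not parse selector strings. Note that != and notin also match objects that do not have the key at all.

```go
package main

import "fmt"

// Labels is the set of key-value pairs attached to an object.
type Labels map[string]string

// matchEq implements key = value (and key == value).
func matchEq(l Labels, key, value string) bool {
	v, ok := l[key]
	return ok && v == value
}

// matchNotEq implements key != value. Objects without the key also match.
func matchNotEq(l Labels, key, value string) bool {
	return !matchEq(l, key, value)
}

// matchIn implements key in (values...).
func matchIn(l Labels, key string, values ...string) bool {
	v, ok := l[key]
	if !ok {
		return false
	}
	for _, want := range values {
		if v == want {
			return true
		}
	}
	return false
}

// matchNotIn implements key notin (values...). Objects without the key also match.
func matchNotIn(l Labels, key string, values ...string) bool {
	return !matchIn(l, key, values...)
}

// matchExists implements the bare key selector; negate it for !key.
func matchExists(l Labels, key string) bool {
	_, ok := l[key]
	return ok
}

func main() {
	pod := Labels{"environment": "production", "tier": "backend"}
	fmt.Println("environment = production:", matchEq(pod, "environment", "production"))
	fmt.Println("tier != frontend:", matchNotEq(pod, "tier", "frontend"))
	fmt.Println("environment in (production, qa):", matchIn(pod, "environment", "production", "qa"))
	fmt.Println("tier notin (frontend, backend):", matchNotIn(pod, "tier", "frontend", "backend"))
	fmt.Println("partition:", matchExists(pod, "partition"))
	fmt.Println("!partition:", !matchExists(pod, "partition"))
}
```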

Finalizers

Deleting an object isn’t as simple as it looks because it involves a lot of conditional checks about the resources in use. Finalizers are keys on an object that tell Kubernetes to wait until certain conditions are met before it fully deletes the object.

When you run kubectl delete namespace/example, k8s checks the finalizers defined on that object.

  1. Issue a deletion command. – Kubernetes marks the object as pending deletion. This leaves the resource in the read-only “Terminating” state.
  2. Run each of the actions associated with the object’s Finalizers. – Each time a Finalizer action completes, that Finalizer is detached from the object, so it’ll no longer appear in the metadata.finalizers field.
  3. Kubernetes keeps monitoring the Finalizers attached to the object. – The object will be deleted once the metadata.finalizers field is empty, because all Finalizers were removed by the completion of their actions.
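
The three steps above can be sketched as a tiny state machine. The object type, its fields and the finalizer names cleanup-a and cleanup-b are made up for this example; the deleting flag stands in for a set metadata.deletionTimestamp.

```go
package main

import "fmt"

// object is a toy API object that carries finalizers.
type object struct {
	name       string
	finalizers []string
	deleting   bool // stands in for a set metadata.deletionTimestamp
	deleted    bool
}

// requestDelete marks the object for deletion. The object is removed right
// away only if no finalizers are left.
func (o *object) requestDelete() {
	o.deleting = true
	o.maybeRemove()
}

// removeFinalizer is called by a controller once its cleanup work is done.
func (o *object) removeFinalizer(name string) {
	kept := o.finalizers[:0]
	for _, f := range o.finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	o.finalizers = kept
	o.maybeRemove()
}

// maybeRemove deletes the object once it is marked for deletion and its
// finalizer list is empty.
func (o *object) maybeRemove() {
	if o.deleting && len(o.finalizers) == 0 {
		o.deleted = true
	}
}

func main() {
	ns := &object{name: "example", finalizers: []string{"cleanup-a", "cleanup-b"}}
	ns.requestDelete()
	fmt.Println("after delete request, deleted:", ns.deleted)
	for _, f := range []string{"cleanup-a", "cleanup-b"} {
		ns.removeFinalizer(f)
		fmt.Printf("removed %s, remaining %v, deleted: %v\n", f, ns.finalizers, ns.deleted)
	}
}
```

This also shows why an object can hang in “Terminating”: if the controller responsible for a finalizer never removes it, the list never becomes empty.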