Saturday, January 9, 2021

Quick Implementation of a K8s operator using the Operator SDK

The operator pattern was initially introduced by CoreOS in this article to automate (and consequently relieve site reliability engineers from) recurring operational tasks. The idea is that of a control loop monitoring the cluster state and performing actions upon changes, so that the actual state can be reconciled with the desired one. In fact, there exist other control loops running within K8s: the kube-controller-manager is a daemon embedding the core control loops that keep K8s up and running. As such, operator controllers extend the behavior of K8s without modifying its code, by querying its API for changes on a custom resource.

In this post we will quickly create a simple K8s operator. Though there exist multiple ways of writing one (see the official K8s doc here), e.g. by directly interacting with the REST API to monitor resources, we will be using the operator-sdk. A very nice ebook documenting the operator-sdk was made freely available by Red Hat, here.


Installing the operator-sdk CLI

Please have a look at the GitHub releases page to download a suitable binary for your system.

After downloading, move the binary to a folder on your PATH and add exec rights (assuming the downloaded file was renamed to operator-sdk):

$ sudo mv ./operator-sdk /usr/local/bin/operator-sdk
$ sudo chmod +x /usr/local/bin/operator-sdk

$ operator-sdk
Development kit for building Kubernetes extensions and tools.

Provides libraries and tools to create new projects, APIs and controllers.
Includes tools for packaging artifacts into an installer container.

Typical project lifecycle:

- initialize a project:

  operator-sdk init --domain example.com --license apache2 --owner "The Kubernetes authors"

- create one or more a new resource APIs and add your code to them:

  operator-sdk create api --group <group> --version <version> --kind <Kind>

Create resource will prompt the user for if it should scaffold the Resource and / or Controller. To only
scaffold a Controller for an existing Resource, select "n" for Resource. To only define
the schema for a Resource without writing a Controller, select "n" for Controller.

After the scaffold is written, api will run make on the project.

Usage:
  operator-sdk [flags]
  operator-sdk [command]

Examples:

  # Initialize your project
  operator-sdk init --domain example.com --license apache2 --owner "The Kubernetes authors"

  # Create a frigates API with Group: ship, Version: v1beta1 and Kind: Frigate
  operator-sdk create api --group ship --version v1beta1 --kind Frigate

  # Edit the API Scheme
  nano api/v1beta1/frigate_types.go

  # Edit the Controller
  nano controllers/frigate_controller.go

  # Install CRDs into the Kubernetes cluster using kubectl apply
  make install

  # Regenerate code and run against the Kubernetes cluster configured by ~/.kube/config
  make run


Available Commands:
  bundle      Manage operator bundle metadata
  cleanup     Clean up an Operator deployed with the 'run' subcommand
  completion  Generators for shell completions
  create      Scaffold a Kubernetes API or webhook
  edit        This command will edit the project configuration
  generate    Invokes a specific generator
  help        Help about any command
  init        Initialize a new project
  olm         Manage the Operator Lifecycle Manager installation in your cluster
  run         Run an Operator in a variety of environments
  scorecard   Runs scorecard
  version     Print the operator-sdk version

Flags:
  -h, --help      help for operator-sdk
      --verbose   Enable verbose logging

Creating a New Operator using Go Modules

$ export GO111MODULE=on

$ operator-sdk init --domain=example.com --repo=github.com/pilillo/example-operator

which creates the boilerplate code common to operators:


$ ls
Dockerfile Makefile PROJECT bin config go.mod go.sum hack main.go

For instance, this includes a Dockerfile to ship the operator, as well as helper functions to automatically set up Prometheus monitoring metrics.

The main.go defines and starts a Manager, which is in charge of interacting with the cluster: it registers the Scheme for the custom resource API definitions and sets up controllers and webhooks.

package main

import (
	"flag"
	"os"

	// Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.)
	// to ensure that exec-entrypoint and run can make use of them.
	_ "k8s.io/client-go/plugin/pkg/client/auth"

	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
	// +kubebuilder:scaffold:imports
)

var (
	scheme   = runtime.NewScheme()
	setupLog = ctrl.Log.WithName("setup")
)

func init() {
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))

	// +kubebuilder:scaffold:scheme
}

func main() {
	var metricsAddr string
	var enableLeaderElection bool
	var probeAddr string
	flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "The address the metric endpoint binds to.")
	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")
	opts := zap.Options{
		Development: true,
	}
	opts.BindFlags(flag.CommandLine)
	flag.Parse()

	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme:                 scheme,
		MetricsBindAddress:     metricsAddr,
		Port:                   9443,
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "78181a6d.example.com",
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}

	// +kubebuilder:scaffold:builder

	if err := mgr.AddHealthzCheck("health", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up health check")
		os.Exit(1)
	}
	if err := mgr.AddReadyzCheck("check", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up ready check")
		os.Exit(1)
	}

	setupLog.Info("starting manager")
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		setupLog.Error(err, "problem running manager")
		os.Exit(1)
	}
}


Implementing a Custom Resource Definition

A Custom Resource Definition (CRD) allows for the definition of a schema for custom resources, thereby extending the Kubernetes API with new object types. Once a custom resource is defined, users can create and access its objects with the usual kubectl commands (i.e. using the basic K8s CRUD operations), as with any other K8s object.

In K8s terminology, there are 4 terms involved in the API definition: groups, versions, kinds and resources. An API group is a collection of related functionality and has one or more versions, each uniquely tracking changes to the API. Each versioned group contains one or more API types, named kinds, whose behaviour may change across versions. A resource, on the other hand, denotes a specific use of a kind in the API; there is not necessarily a 1:1 mapping between kinds and resources, since the same kind may be returned by different resources. In CRDs, however, a kind corresponds to a single resource.
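For instance, the Memcached custom resource created in the next section is identified by the following group, version and kind; a minimal sketch using apimachinery's schema package (the values match the create api command below, with the group suffixed by the domain chosen at init time):

import "k8s.io/apimachinery/pkg/runtime/schema"

// GroupVersionKind identifying the Memcached custom resource
var memcachedGVK = schema.GroupVersionKind{
	Group:   "cache.example.com", // API group, suffixed with the project domain
	Version: "v1alpha1",          // API version, tracking schema changes
	Kind:    "Memcached",         // the API type within the versioned group
}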

Following the Memcached example reported here and here, let's add an API:

$ operator-sdk create api \
    --group=cache \
    --version=v1alpha1 \
    --kind=Memcached
which prompts whether to scaffold the Resource and the Controller, and then runs make:

Create Resource [y/n]
y
Create Controller [y/n]
y
Writing scaffold for you to edit...
api/v1alpha1/memcached_types.go
controllers/memcached_controller.go
Running make:
$ make
/Users/pilillo/Documents/example-operator/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go

An api folder was created to contain the CRD definitions, along with a controllers folder for the controller code:
$ ls
Dockerfile  Makefile    PROJECT     api         bin         config      controllers go.mod      go.sum      hack        main.go
$ ls api/v1alpha1
groupversion_info.go     memcached_types.go       zz_generated.deepcopy.go
$ ls controllers
memcached_controller.go suite_test.go

The api/v1alpha1/memcached_types.go file contains the type definitions used to marshal/unmarshal the CRD:
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required.  Any new fields you add must have json tags for the fields to be serialized.

// MemcachedSpec defines the desired state of Memcached
type MemcachedSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// Foo is an example field of Memcached. Edit Memcached_types.go to remove/update
	Foo string `json:"foo,omitempty"`
}

// MemcachedStatus defines the observed state of Memcached
type MemcachedStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// Memcached is the Schema for the memcacheds API
type Memcached struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   MemcachedSpec   `json:"spec,omitempty"`
	Status MemcachedStatus `json:"status,omitempty"`
}

// +kubebuilder:object:root=true

// MemcachedList contains a list of Memcached
type MemcachedList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Memcached `json:"items"`
}

func init() {
	SchemeBuilder.Register(&Memcached{}, &MemcachedList{})
}

As visible from the code comments, any modification to the file should be followed by a call to make generate, which regenerates the derived code (e.g. the deepcopy functions) for the CRD types.

❯ make generate
/Users/pilillo/Documents/example-operator/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
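For instance, following the official Memcached tutorial, the placeholder Foo field could be replaced with a Size field holding the desired number of replicas, and the status extended with the list of pod names; a sketch, with field names borrowed from the upstream example:

// MemcachedSpec defines the desired state of Memcached
type MemcachedSpec struct {
	// Size is the desired number of memcached deployment replicas
	Size int32 `json:"size"`
}

// MemcachedStatus defines the observed state of Memcached
type MemcachedStatus struct {
	// Nodes lists the names of the memcached pods
	Nodes []string `json:"nodes"`
}

Besides make generate, a make manifests regenerates the CRD manifests under config/crd/bases to reflect the new schema.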

Implementing the Operator logic

Let's have a look at the previously created memcached_controller.go:
package controllers

import (
	"context"

	"github.com/go-logr/logr"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	cachev1alpha1 "github.com/pilillo/example-operator/api/v1alpha1"
)

// MemcachedReconciler reconciles a Memcached object
type MemcachedReconciler struct {
	client.Client
	Log    logr.Logger
	Scheme *runtime.Scheme
}

// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/finalizers,verbs=update

// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
// TODO(user): Modify the Reconcile function to compare the state specified by
// the Memcached object against the actual cluster state, and then
// perform operations to make the cluster state reflect the state specified by
// the user.
//
// For more details, check Reconcile and its Result here:
// - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.7.0/pkg/reconcile
func (r *MemcachedReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	_ = r.Log.WithValues("memcached", req.NamespacedName)

	// your logic here

	return ctrl.Result{}, nil
}

// SetupWithManager sets up the controller with the Manager.
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&cachev1alpha1.Memcached{}).
		Complete(r)
}

The SetupWithManager function registers the custom resource kind cachev1alpha1.Memcached as the primary resource to watch, whereas a call to Owns(&appsv1.Deployment{}) can be added to also watch Deployments created by the controller, as sketched below.
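A minimal sketch of the extended setup, assuming appsv1 aliases the k8s.io/api/apps/v1 package:

import appsv1 "k8s.io/api/apps/v1"

// SetupWithManager sets up the controller with the Manager.
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&cachev1alpha1.Memcached{}). // primary resource to watch
		Owns(&appsv1.Deployment{}).      // secondary resources created by the controller
		Complete(r)
}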

The automatically generated Reconcile function is the one defining the controller logic. It is called every time a custom resource of kind Memcached is created, changed or deleted, as well as upon errors in the reconcile loop. Specifically, a ctrl.Request argument is provided, from which the actual custom resource can be retrieved:

memcached := &cachev1alpha1.Memcached{}
err := r.Get(ctx, req.NamespacedName, memcached)
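Since the resource may have been deleted after the reconcile request was queued, a common idiom (a sketch following the standard controller-runtime pattern, with errors imported from k8s.io/apimachinery/pkg/api/errors) is to treat the not-found case as a successful reconcile:

memcached := &cachev1alpha1.Memcached{}
if err := r.Get(ctx, req.NamespacedName, memcached); err != nil {
	if errors.IsNotFound(err) {
		// The resource was deleted after the request was queued;
		// owned objects are garbage collected, so there is nothing to do.
		return ctrl.Result{}, nil
	}
	// Failed to read the resource - requeue the request.
	return ctrl.Result{}, err
}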

Based on the returned error, the Request may be requeued and the reconcile triggered again:
// Reconcile successful - don't requeue
return ctrl.Result{}, nil
// Reconcile failed due to error - requeue
return ctrl.Result{}, err
// Requeue for any reason other than an error
return ctrl.Result{Requeue: true}, nil
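The reconcile can also be scheduled to run again after a fixed delay, e.g. to periodically poll an external system (assuming the time package is imported):

// Requeue after a given amount of time
return ctrl.Result{RequeueAfter: time.Minute}, nil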


Testing the operator

The operator can either be deployed to the cluster or run locally for testing purposes. To install the CRDs into the cluster and deploy the operator:

$ make install
$ make deploy

Alternatively, to run the operator locally against the cluster configured in ~/.kube/config:

$ make run

and stop it with Ctrl+C.


Deploying the operator

The Makefile also contains targets for building and pushing a Docker image with the controller:
# Build the docker image
docker-build: test
	docker build -t ${IMG} .

# Push the docker image
docker-push:
	docker push ${IMG}
Therefore, it is enough to set the target image name:

export IMG=example/memcached-operator:v0.0.1

and simply run:

make docker-build
make docker-push

The same IMG variable is used by make deploy to set the controller image in the manager deployment.
