Application-Consistent Backups for Kubernetes Using CvTasks

To implement application-consistent backups for Kubernetes, you can execute tasks before and after taking volume snapshots.

Tasks are scripts or commands that run before or after certain job phases. During backups, the script files are copied to the application pods and executed there. Tasks can be written in any scripting language, such as Bash, Perl, or Python. To use Perl or Python scripts, the interpreter must already be installed in the pod.

CvTasks Method vs Pre/Post-Script Method

Two methods are supported: the newer CvTasks method described on this page, and the older method of executing .prescript and .postscript files from access nodes (for Metallic, the access nodes must be on-premises). If both methods are configured, the CvTasks method is used.

Requirements

Before you use CvTasks on a Kubernetes cluster, the CvTask and CvTaskSet CRD must be applied to that cluster. The CRD can be obtained from Commvault / Kubernetes on GitHub and applied with the following command:

  • kubectl create -k AppConsistency/deploy/

Custom Resources

The following custom resources are used:

  • CvTask:

    • Defines a task. A task represents a unit of execution.

    • Can be a single command, a script text, or a local script (residing in the Pod).

  • CvTaskSet:

    • Links a Kubernetes application to the tasks that need to be executed.

    • Prescribes an order in which these tasks will be executed.

CvTask

The commandType field of a CvTask specifies how the task is executed. The following values are supported:

  • Command: A command is executed directly on the Pod.

  • LocalScript: Local script that is present on the Pod is executed. (You must provide absolute paths.)

  • ScriptText: Custom script (in the command field of CvTask) is copied and executed inside the Pod.
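The rules above can be checked before a CvTask is applied to the cluster. The following is a minimal sketch, not part of the Commvault tooling: the function name check_phase and the wording of its messages are hypothetical, and it only covers the constraints stated on this page (a known commandType, a non-empty command, and an absolute path for LocalScript).

```python
import posixpath

# The three commandType values listed above.
COMMAND_TYPES = {"Command", "LocalScript", "ScriptText"}


def check_phase(phase):
    """Return a list of problems in one preBackupSnapshot or postBackupSnapshot block.

    `phase` is the block as a plain dict, for example loaded from a manifest.
    This helper is hypothetical and only mirrors the rules described above.
    """
    problems = []
    command_type = phase.get("commandType")
    command = phase.get("command") or ""
    if command_type not in COMMAND_TYPES:
        problems.append("unknown commandType: %r" % (command_type,))
    if not command.strip():
        problems.append("command is empty")
    # LocalScript refers to a script that already exists inside the Pod,
    # so the path must be absolute.
    elif command_type == "LocalScript" and not posixpath.isabs(command):
        problems.append("LocalScript command must be an absolute path")
    return problems


good = {"commandType": "LocalScript", "command": "/root/scripts/mysqldb-actions", "args": ["quiesce"]}
bad = {"commandType": "LocalScript", "command": "scripts/mysqldb-actions", "args": ["quiesce"]}
print(check_phase(good))
print(check_phase(bad))
```

An empty list means that the block passed these checks. It does not guarantee that the command exists or succeeds inside the Pod.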

CvTaskSet

CvTaskSet must include at least one of the following fields, which act as the selector for the application pods on which the CvTasks are executed:

  • appName

  • appNamespace

  • labelSelectors

A CvTaskSet can list multiple tasks, and each task maps to one CvTask. To execute multiple tasks for an application, create multiple CvTasks and add them to the tasks list in the CvTaskSet.

The following is a sample YAML for CvTaskSet. In this example, for all applications with the name wordpress in the namespace wordpress-mysql, the tasks cvtask001 and cvtask002 are executed in that order.

apiVersion: k8s.cv.io/v1
kind: CvTaskSet
metadata:
  name: cvtaskset001
  namespace: cv-config
spec:
  appName: wordpress
  appNamespace: wordpress-mysql
  tasks:
  - cvTaskName: cvtask001
    cvTaskNamespace: cv-config
    id: testid0001
    isDisabled: false
    executionOrder: 1
  - cvTaskName: cvtask002
    cvTaskNamespace: cv-config
    id: testid0002
    isDisabled: false
    executionOrder: 2
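
A manifest like the sample above can also be generated when an application needs many tasks. The following is a minimal sketch that uses only the fields shown in the sample. The helper name build_task_set and the id scheme are assumptions made for illustration, not part of the product.

```python
import json


def build_task_set(name, app_name, app_namespace, task_names, config_namespace="cv-config"):
    """Build a CvTaskSet manifest whose tasks run in the order of `task_names`."""
    tasks = []
    for position, task_name in enumerate(task_names, start=1):
        tasks.append({
            "cvTaskName": task_name,
            "cvTaskNamespace": config_namespace,
            "id": "task-%04d" % position,  # hypothetical id scheme
            "isDisabled": False,
            "executionOrder": position,
        })
    return {
        "apiVersion": "k8s.cv.io/v1",
        "kind": "CvTaskSet",
        "metadata": {"name": name, "namespace": config_namespace},
        "spec": {"appName": app_name, "appNamespace": app_namespace, "tasks": tasks},
    }


manifest = build_task_set("cvtaskset001", "wordpress", "wordpress-mysql", ["cvtask001", "cvtask002"])
print(json.dumps(manifest, indent=2))
```

The output is JSON rather than YAML. kubectl accepts both formats, so the output can be saved to a file and applied with kubectl apply -f.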

Process Steps

  1. For each application that is configured as content in the application group, during the backup, Commvault searches CvTaskSets in the cv-config namespace for matching applications.

  2. If Commvault finds one or more matching CvTaskSets, Commvault executes the tasks that are specified in CvTaskSet on applications, in the specified order.
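
The two steps above can be pictured with a short sketch. This is an illustration of the described behavior, not Commvault's implementation. It assumes that a selector field that is set must equal the application's value, that disabled tasks are skipped, and that tasks run in ascending executionOrder. It ignores labelSelectors. The function names are hypothetical.

```python
def matches(spec, app_name, app_namespace):
    """Return True if the appName/appNamespace selectors of a CvTaskSet spec match."""
    wanted_name = spec.get("appName")
    wanted_namespace = spec.get("appNamespace")
    if wanted_name is None and wanted_namespace is None:
        return False  # no selector that this sketch understands
    if wanted_name is not None and wanted_name != app_name:
        return False
    if wanted_namespace is not None and wanted_namespace != app_namespace:
        return False
    return True


def select_tasks(task_set_specs, app_name, app_namespace):
    """Return the CvTask names to run for one application, in execution order."""
    selected = []
    for spec in task_set_specs:
        if not matches(spec, app_name, app_namespace):
            continue
        enabled = [t for t in spec.get("tasks", []) if not t.get("isDisabled", False)]
        enabled.sort(key=lambda t: t["executionOrder"])
        selected.extend(t["cvTaskName"] for t in enabled)
    return selected


task_sets = [{
    "appName": "wordpress-mysql",
    "appNamespace": "wordpress",
    "tasks": [
        {"cvTaskName": "cvtask001", "executionOrder": 1, "isDisabled": False},
        {"cvTaskName": "cvtask002", "executionOrder": 3, "isDisabled": False},
        {"cvTaskName": "cvtask003", "executionOrder": 2, "isDisabled": True},
    ],
}]
print(select_tasks(task_sets, "wordpress-mysql", "wordpress"))  # ['cvtask001', 'cvtask002']
```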

Database Detection Script

You can use the CVK8SDiscoverDB.sh script to detect MySQL and PostgreSQL database pods that run inside the Kubernetes cluster. The script generates CvTask and CvTaskSet custom resource YAML files to quiesce these databases. The YAML files are placed in the 'manifests' folder inside the script directory. You can also use the script to apply the YAML files to the cluster.

To get the script, go to DBDetection in Commvault / Kubernetes on GitHub.

Syntax for the Database Detection Script

CVK8SDiscoverDB.sh --namespace <name / all> --db <mysql / postgresql / all>

Examples for the Database Detection Script

The following command detects MySQL database pods that run under the 'test' namespace:

CVK8SDiscoverDB.sh --namespace test --db mysql

The following command detects PostgreSQL pods that run under the 'test1' and 'test2' namespaces:

CVK8SDiscoverDB.sh --namespace test1, test2 --db postgresql

The following command detects MySQL and PostgreSQL pods that run under all namespaces:

CVK8SDiscoverDB.sh --namespace all --db all
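
When the detection script is driven from automation, the arguments can be validated before the script is called. The following is a minimal sketch based only on the syntax shown above. The helper name discover_command and the ./ script location are assumptions, and the sketch handles a single namespace or all.

```python
import shlex

# Values accepted by --db according to the syntax above.
DB_CHOICES = ("mysql", "postgresql", "all")


def discover_command(namespace, db, script="./CVK8SDiscoverDB.sh"):
    """Return the argument list for one run of the detection script.

    The default script path is an assumption; adjust it to where the
    script was downloaded.
    """
    if db not in DB_CHOICES:
        raise ValueError("db must be one of: %s" % ", ".join(DB_CHOICES))
    if not namespace:
        raise ValueError("namespace must be a namespace name or 'all'")
    return [script, "--namespace", namespace, "--db", db]


argv = discover_command("test", "mysql")
print(shlex.join(argv))
```

The returned list can be passed to subprocess.run(argv, check=True) to execute the script.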

Custom Resource Examples

For more examples, see AppConsistency in Commvault / Kubernetes on GitHub.

CvTask Example 1: Command

In this example:

  • The command mysql with arguments -hHOSTNAME, -uUSER, -pPASSWORD, DATABASE, and -e"FLUSH TABLES WITH READ LOCK;" is called before the volume snapshot is taken.

  • The command mysql with arguments -hHOSTNAME, -uUSER, -pPASSWORD, DATABASE, and -e"UNLOCK TABLES;" is called after the volume snapshot is taken.

    apiVersion: k8s.cv.io/v1
    kind: CvTask
    metadata:
      name: cvtask001
      namespace: cv-config
    spec:
      postBackupSnapshot:
        args:
        - -hHOSTNAME
        - -uUSER
        - -pPASSWORD
        - DATABASE
        - -e"UNLOCK TABLES;"
        command: mysql
        commandType: Command
      preBackupSnapshot:
        args:
        - -hHOSTNAME
        - -uUSER
        - -pPASSWORD
        - DATABASE
        - -e"FLUSH TABLES WITH READ LOCK;"
        command: mysql
        commandType: Command

CvTask Example 2: ScriptText

In this example:

  • The script in the command field under preBackupSnapshot is copied onto the application pod and executed before the volume snapshot is taken.

  • The script in the command field under postBackupSnapshot is copied onto the application pod and executed after the volume snapshot is taken.

    apiVersion: k8s.cv.io/v1
    kind: CvTask
    metadata:
      name: cvtask009-python-scripttext
      namespace: cv-config
    spec:
      preBackupSnapshot:
        args: []
        command: |
          #!/usr/local/bin/python
          import MySQLdb
          import os
          import time
          import datetime      
          dt=datetime.datetime.now().strftime("%I:%M%p on %B %d, %Y")
          file1 = open("/scripts/pre-freeze.log","a+" )
          try:
            conn = MySQLdb.connect ('localhost' , 'root' , 'password' )
            cur = conn.cursor()
            cur.execute ("select version()")
            data = cur.fetchone()
            file1.write (dt)
            file1.write ("-------------------------------------------\n")
            file1.write ("-------------------------------------------\n")
            file1.write ("\t MySQL version is %s: "%data)
            file1.write ("-------------------------------------------\n")
            file1.write ("-------------------------------------------\n")
          except:
            file1.write (dt)
            file1.write("\t unable to connect to MySQL server\n")
            file2 = open ('/tmp/freeze_snap.lock', 'w')
            file2.close()
          try:
            cur.execute (" flush tables with read lock ")
            file1.write (dt)
            file1.write ("\t using quiesce.py script - quiesce of database successful \n")
          except:
            file1.write(dt)
            file1.write( "\n unexpected error from MySQL, unable to do flush tables with read lock, Please check MySQL error logs for more info\n")
            while True:
              check = os.path.exists ("/tmp/freeze_snap.lock")
              if check == True:
                continue
              else:
                break
        commandType: ScriptText
      postBackupSnapshot:
        args: []
        command: |
          #!/usr/local/bin/python
          import MySQLdb
          import os
          import time
          import datetime      
          dt=datetime.datetime.now().strftime("%I:%M%p on %B %d, %Y")
          file1 = open("/scripts/post-thaw.log","a+" )
          try:
            os.remove('/tmp/freeze_snap.lock')
            time.sleep(2)
          except Exception as e:
            print(e)
          try:
            conn = MySQLdb.connect ('localhost' , 'root' , 'password' )
            cur = conn.cursor()
            cur.execute ("select version()")
            data = cur.fetchone()
            file1.write (dt)
            file1.write ("-------------------------------------------\n")
            file1.write ("-------------------------------------------\n")
            file1.write ("\t MySQL version is %s: "%data)
            file1.write ("-------------------------------------------\n")
            file1.write ("-------------------------------------------\n")
          except:
            file1.write (dt)
            file1.write("\t unable to connect to MySQL server\n")
          try:
            file1.write (dt)
            file1.write ("\t executing query to unquiesce the database \n")
            cur.execute ("unlock tables")
            file1.write (dt)
            file1.write ("\t Database is in unquiesce mode now \n")
          except:
            file1.write(dt)
            file1.write( "\n unexpected error from MySQL, unable to unlock tables. Please check MySql error logs for more info \n")
            cur.close()
            conn.close()
        commandType: ScriptText

CvTask Example 3: LocalScript

In this example:

  • The local script /root/scripts/mysqldb-actions with argument quiesce is called before the volume snapshot is taken.

  • The local script /root/scripts/mysqldb-actions with argument unquiesce is called after the volume snapshot is taken.

    apiVersion: k8s.cv.io/v1
    kind: CvTask
    metadata:
      name: cvtask003-local
      namespace: cv-config
    spec:
      postBackupSnapshot:
        args:
        - unquiesce
        command: /root/scripts/mysqldb-actions
        commandType: LocalScript
      preBackupSnapshot:
        args:
        - quiesce
        command: /root/scripts/mysqldb-actions
        commandType: LocalScript

CvTaskSet Example 1: appName

In this example, for applications in the namespace wordpress that have the name wordpress-mysql, task cvtask001 is executed, followed by cvtask002. Task cvtask003 is skipped because it is disabled.

apiVersion: k8s.cv.io/v1
kind: CvTaskSet
metadata:
  name: cvtaskset001
  namespace: cv-config
spec:
  appName: wordpress-mysql
  appNamespace: wordpress
  tasks:
  - cvTaskName: cvtask001
    cvTaskNamespace: cv-config
    executionOrder: 1
    id: custom-testcase-123
    isDisabled: false
  - cvTaskName: cvtask002
    cvTaskNamespace: cv-config
    executionOrder: 3
    id: custom-testcase-123
    isDisabled: false
  - cvTaskName: cvtask003
    cvTaskNamespace: cv-config
    executionOrder: 2
    id: custom-testcase-123
    isDisabled: true

CvTaskSet Example 2: labelSelectors

In this example, for applications in the namespace wordpress that have labels tier=frontend and app=database, tasks cvtask001 and cvtask002 are executed in that order. Task cvtask003 is skipped because it is disabled.

apiVersion: k8s.cv.io/v1
kind: CvTaskSet
metadata:
  name: cvtaskset001
  namespace: cv-config
spec:
  labelSelectors:
    - -tier=frontend
    - -app=database
  appNamespace: wordpress
  tasks:
  - cvTaskName: cvtask001
    cvTaskNamespace: cv-config
    executionOrder: 1
    id: custom-testcase-123
    isDisabled: false
  - cvTaskName: cvtask002
    cvTaskNamespace: cv-config
    executionOrder:
    id: custom-testcase-123
    isDisabled: false
  - cvTaskName: cvtask003
    cvTaskNamespace: cv-config
    executionOrder: 2
    id: custom-testcase-123
    isDisabled: true
