Application-Consistent Backups for Kubernetes Using CvTasks

To implement application-consistent backups for Kubernetes, you can execute tasks before and after taking volume snapshots.

Tasks are scripts or commands that run before or after certain job phases. During backups, the script files are copied to the application pods and executed inside them. Tasks can be written in any scripting language, such as Bash, Perl, or Python. To use Perl or Python scripts, the interpreter must already be installed in the pod.

CvTasks Method vs Pre/Post-Script Method

Both the new CvTasks method described on this page and the old method of executing .prescript and .postscript files from access nodes (for Metallic, the access nodes must be on-premises) are supported. If both methods are configured, the new method is used.

Pre-Requisites

For a given Kubernetes cluster, the following CRD must be applied. The CRD can be obtained from https://github.com/Commvault/Kubernetes/.

  • kubectl create -k AppConsistency/deploy/

Custom Resources

The following custom resources are used:

  • CvTask:

    • Defines a task. A task represents a unit of execution.

    • Can be a single command, a script text, or a local script (residing in the Pod).

  • CvTaskSet:

    • Links a Kubernetes application to the tasks that need to be executed.

    • Prescribes an order in which these tasks will be executed.

CvTask

A CvTask can run its task in one of the following ways, selected with the commandType field:

  • Command: Direct command is executed on the Pod.

  • LocalScript: Local script that is present on the Pod is executed. (You must provide absolute paths.)

  • ScriptText: Custom script (in the command field of CvTask) is copied and executed inside the Pod.
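The three commandType values and the absolute-path requirement for LocalScript can be checked locally before a CvTask is applied. The following is a minimal sketch, not Commvault's own validation: the function name check_phase and the idea of passing one preBackupSnapshot or postBackupSnapshot block as a Python dictionary are assumptions made for illustration.

```python
import posixpath

# The three values listed above for commandType.
VALID_COMMAND_TYPES = {"Command", "LocalScript", "ScriptText"}


def check_phase(phase):
    """Return a list of problems found in one pre/post snapshot block.

    `phase` is a dict with the keys used on this page:
    command, commandType, and (optionally) args.
    This is a hypothetical local pre-check, not part of Commvault.
    """
    problems = []
    command_type = phase.get("commandType")
    command = phase.get("command") or ""

    if command_type not in VALID_COMMAND_TYPES:
        problems.append("unknown commandType: %r" % (command_type,))
    if not command.strip():
        problems.append("command is empty")
    # LocalScript refers to a script inside the Pod, so the path must be absolute.
    if command_type == "LocalScript" and not posixpath.isabs(command):
        problems.append("LocalScript command must be an absolute path")
    return problems
```

An empty list means the block passed these basic checks. It does not confirm that the command or script actually exists in the Pod.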

CvTaskSet

CvTaskSet must include one of the following fields to function as a selector for CvTasks to be executed on the application pods:

  • appName

  • appNamespace

  • labelSelectors
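How these selector fields might combine can be sketched in Python. This is an illustration under stated assumptions, not Commvault's matching logic: it assumes that every selector field present in the CvTaskSet must match (which is how the examples on this page are described), and that each label selector is a plain "key=value" string. The function name matches is hypothetical.

```python
def matches(task_set_spec, app_name, app_namespace, app_labels):
    """Return True if every selector field present in the spec matches the app.

    Assumptions (not confirmed by the product documentation):
    - fields that are present are combined with AND;
    - each entry in labelSelectors is a "key=value" string.
    """
    name = task_set_spec.get("appName")
    namespace = task_set_spec.get("appNamespace")
    selectors = task_set_spec.get("labelSelectors") or []

    # A CvTaskSet without any selector field has nothing to match on.
    if name is None and namespace is None and not selectors:
        return False
    if name is not None and name != app_name:
        return False
    if namespace is not None and namespace != app_namespace:
        return False
    for selector in selectors:
        key, _, value = selector.partition("=")
        if app_labels.get(key) != value:
            return False
    return True
```

For example, a spec with appName wordpress and appNamespace wordpress-mysql matches only an application with that name in that namespace.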

CvTaskSet can have multiple tasks to be executed. Each task is mapped to a CvTask. So, if you want to execute multiple tasks for an application, you can create multiple CvTasks and add them to the tasks in CvTaskSet.

The following is a sample YAML for CvTaskSet. In this example, for all applications with the name wordpress in the namespace wordpress-mysql, the tasks cvtask001 and cvtask002 are executed in that order.

    apiVersion: k8s.cv.io/v1
    kind: CvTaskSet
    metadata:
      name: cvtaskset001
      namespace: cv-config
    spec:
      appName: wordpress
      appNamespace: wordpress-mysql
      tasks:
      - cvTaskName: cvtask001
        cvTaskNamespace: cv-config
        id: testid0001
        isDisabled: false
        executionOrder: 1
      - cvTaskName: cvtask002
        cvTaskNamespace: cv-config
        id: testid0002
        isDisabled: false
        executionOrder: 2

Process Steps

  1. At the time of backup, for each application configured as content in the application group, Commvault searches the CvTaskSets in the cv-config namespace for ones that match the application.

  2. On finding one or more matching CvTaskSets, Commvault executes the tasks listed in each CvTaskSet on the application, in the specified order.
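The ordering in step 2 can be sketched as follows, based on how the examples on this page describe it: disabled tasks are skipped and the remaining tasks run in ascending executionOrder. The function name execution_plan is hypothetical, and the sketch only models the ordering; it does not contact a cluster or run anything.

```python
def execution_plan(task_set_spec):
    """Return the CvTask names that would run, in order.

    `task_set_spec` is the `spec` of a CvTaskSet as a dict.
    Tasks with isDisabled set to true are dropped; the rest are
    sorted by executionOrder (ascending).
    """
    tasks = task_set_spec.get("tasks", [])
    enabled = [t for t in tasks if not t.get("isDisabled", False)]
    enabled.sort(key=lambda t: t["executionOrder"])
    return [t["cvTaskName"] for t in enabled]


# Usage with values taken from the CvTaskSet examples on this page.
spec = {
    "appName": "wordpress-mysql",
    "appNamespace": "wordpress",
    "tasks": [
        {"cvTaskName": "cvtask001", "executionOrder": 1, "isDisabled": False},
        {"cvTaskName": "cvtask002", "executionOrder": 3, "isDisabled": False},
        {"cvTaskName": "cvtask003", "executionOrder": 2, "isDisabled": True},
    ],
}
plan = execution_plan(spec)  # ["cvtask001", "cvtask002"]
```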

Custom Resource Examples

For more examples, see AppConsistency in Commvault GitHub.

CvTask Example 1: Command

In this example:

  • The command mysql with arguments -hHOSTNAME, -uUSER, -pPASSWORD, DATABASE, and -e"FLUSH TABLES WITH READ LOCK;" is called before the volume snapshot is taken.

  • The command mysql with arguments -hHOSTNAME, -uUSER, -pPASSWORD, DATABASE, and -e"UNLOCK TABLES;" is called after the volume snapshot is taken.

    apiVersion: k8s.cv.io/v1
    kind: CvTask
    metadata:
      name: cvtask001
      namespace: cv-config
    spec:
      postBackupSnapshot:
        args:
        - -hHOSTNAME
        - -uUSER
        - -pPASSWORD
        - DATABASE
        - -e"UNLOCK TABLES;"
        command: mysql
        commandType: Command
      preBackupSnapshot:
        args:
        - -hHOSTNAME
        - -uUSER
        - -pPASSWORD
        - DATABASE
        - -e"FLUSH TABLES WITH READ LOCK;"
        command: mysql
        commandType: Command
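When the same pair of commands is needed for several databases, the manifest can be generated rather than hand-edited. The following sketch builds the same structure as the example above as a Python dictionary and prints it as JSON, which kubectl also accepts as a manifest format. The function name mysql_lock_cvtask and its parameters are hypothetical; the field names and argument strings are copied from the example.

```python
import json


def mysql_lock_cvtask(name, host, user, password, database, namespace="cv-config"):
    """Build a CvTask manifest (as a dict) mirroring Example 1."""

    def phase(statement):
        # One pre/post block: the mysql client with the same argument
        # layout as in the example above.
        return {
            "command": "mysql",
            "commandType": "Command",
            "args": [
                "-h" + host,
                "-u" + user,
                "-p" + password,
                database,
                '-e"%s"' % statement,
            ],
        }

    return {
        "apiVersion": "k8s.cv.io/v1",
        "kind": "CvTask",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "preBackupSnapshot": phase("FLUSH TABLES WITH READ LOCK;"),
            "postBackupSnapshot": phase("UNLOCK TABLES;"),
        },
    }


if __name__ == "__main__":
    task = mysql_lock_cvtask("cvtask001", "HOSTNAME", "USER", "PASSWORD", "DATABASE")
    print(json.dumps(task, indent=2))
```

The printed JSON can be saved to a file and reviewed before it is applied to the cluster.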

CvTask Example 2: ScriptText

In this example:

  • The entire script under preBackupSnapshot is copied onto the application pod and executed before the volume snapshot is taken.

  • The entire script under postBackupSnapshot is copied onto the application pod and executed after the volume snapshot is taken.

    apiVersion: k8s.cv.io/v1
    kind: CvTask
    metadata:
      name: cvtask009-python-scripttext
      namespace: cv-config
    spec:
      preBackupSnapshot:
        args: []
        command: |
          #!/usr/local/bin/python
          import MySQLdb
          import os
          import time
          import datetime

          dt=datetime.datetime.now().strftime("%I:%M%p on %B %d, %Y")
          file1 = open("/scripts/pre-freeze.log","a+")
          # Connect to MySQL and log the server version.
          try:
              conn = MySQLdb.connect('localhost', 'root', 'password')
              cur = conn.cursor()
              cur.execute("select version()")
              data = cur.fetchone()
              file1.write(dt)
              file1.write("-------------------------------------------\n")
              file1.write("-------------------------------------------\n")
              file1.write("\t MySQL version is %s: "%data)
              file1.write("-------------------------------------------\n")
              file1.write("-------------------------------------------\n")
          except:
              file1.write(dt)
              file1.write("\t unable to connect to MySQL server\n")
          # Create the lock file that the post-snapshot script removes.
          file2 = open('/tmp/freeze_snap.lock', 'w')
          file2.close()
          # Quiesce the database.
          try:
              cur.execute(" flush tables with read lock ")
              file1.write(dt)
              file1.write("\t using quiesce.py script - quiesce of database successful \n")
          except:
              file1.write(dt)
              file1.write("\n unexpected error from MySQL, unable to do flush tables with read lock, Please check MySQL error logs for more info\n")
          # Keep this session (and its read lock) open until the lock file is removed.
          while True:
              check = os.path.exists("/tmp/freeze_snap.lock")
              if check == True:
                  continue
              else:
                  break
        commandType: ScriptText
      postBackupSnapshot:
        args: []
        command: |
          #!/usr/local/bin/python
          import MySQLdb
          import os
          import time
          import datetime

          dt=datetime.datetime.now().strftime("%I:%M%p on %B %d, %Y")
          file1 = open("/scripts/post-thaw.log","a+")
          # Remove the lock file so that the pre-snapshot script stops waiting.
          try:
              os.remove('/tmp/freeze_snap.lock')
              time.sleep(2)
          except Exception as e:
              print(e)
          # Connect to MySQL and log the server version.
          try:
              conn = MySQLdb.connect('localhost', 'root', 'password')
              cur = conn.cursor()
              cur.execute("select version()")
              data = cur.fetchone()
              file1.write(dt)
              file1.write("-------------------------------------------\n")
              file1.write("-------------------------------------------\n")
              file1.write("\t MySQL version is %s: "%data)
              file1.write("-------------------------------------------\n")
              file1.write("-------------------------------------------\n")
          except:
              file1.write(dt)
              file1.write("\t unable to connect to MySQL server\n")
          # Unquiesce the database.
          try:
              file1.write(dt)
              file1.write("\t executing query to unquiesce the database \n")
              cur.execute("unlock tables")
              file1.write(dt)
              file1.write("\t Database is in unquiesce mode now \n")
          except:
              file1.write(dt)
              file1.write("\n unexpected error from MySQL, unable to unlock tables. Please check MySql error logs for more info \n")
          cur.close()
          conn.close()
        commandType: ScriptText
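In this example the two scripts coordinate through the file /tmp/freeze_snap.lock: the pre-snapshot script creates it and then loops, which keeps its MySQL session open, until the post-snapshot script deletes it. The loop in the example checks the file continuously and never gives up. A possible variation, shown here only as a sketch, sleeps between checks and stops after a timeout. The function name wait_until_removed and the default interval and timeout values are assumptions, not part of the original script.

```python
import os
import time


def wait_until_removed(path, poll_seconds=1.0, timeout_seconds=300.0):
    """Block until `path` no longer exists.

    Returns True once the file is gone, or False if it still exists
    when the timeout expires. Sleeping between checks avoids the
    busy loop used in the example script.
    """
    deadline = time.time() + timeout_seconds
    while os.path.exists(path):
        if time.time() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True
```

In the pre-snapshot script, a call such as wait_until_removed("/tmp/freeze_snap.lock") could stand in for the final while loop. What the script should do when the timeout expires (for example, write to the log and exit) is left open here.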

CvTask Example 3: LocalScript

In this example:

  • The local script /root/scripts/mysqldb-actions with argument quiesce is called before the volume snapshot is taken.

  • The local script /root/scripts/mysqldb-actions with argument unquiesce is called after the volume snapshot is taken.

    apiVersion: k8s.cv.io/v1
    kind: CvTask
    metadata:
      name: cvtask003-local
      namespace: cv-config
    spec:
      postBackupSnapshot:
        args:
        - unquiesce
        command: /root/scripts/mysqldb-actions
        commandType: LocalScript
      preBackupSnapshot:
        args:
        - quiesce
        command: /root/scripts/mysqldb-actions
        commandType: LocalScript

CvTaskSet Example 1: appName

In this example, for applications in the namespace wordpress that have the name wordpress-mysql, task cvtask001 is executed, followed by cvtask002. Task cvtask003 is skipped because it is disabled.

    apiVersion: k8s.cv.io/v1
    kind: CvTaskSet
    metadata:
      name: cvtaskset001
      namespace: cv-config
    spec:
      appName: wordpress-mysql
      appNamespace: wordpress
      tasks:
      - cvTaskName: cvtask001
        cvTaskNamespace: cv-config
        executionOrder: 1
        id: custom-testcase-123
        isDisabled: false
      - cvTaskName: cvtask002
        cvTaskNamespace: cv-config
        executionOrder: 3
        id: custom-testcase-123
        isDisabled: false
      - cvTaskName: cvtask003
        cvTaskNamespace: cv-config
        executionOrder: 2
        id: custom-testcase-123
        isDisabled: true

CvTaskSet Example 2: labelSelectors

In this example, for applications in the namespace wordpress that have the labels tier=frontend and app=database, tasks cvtask001 and cvtask002 are executed in that order. Task cvtask003 is skipped because it is disabled.

    apiVersion: k8s.cv.io/v1
    kind: CvTaskSet
    metadata:
      name: cvtaskset001
      namespace: cv-config
    spec:
      labelSelectors:
      - -tier=frontend
      - -app=database
      appNamespace: wordpress
      tasks:
      - cvTaskName: cvtask001
        cvTaskNamespace: cv-config
        executionOrder: 1
        id: custom-testcase-123
        isDisabled: false
      - cvTaskName: cvtask002
        cvTaskNamespace: cv-config
        executionOrder: 3
        id: custom-testcase-123
        isDisabled: false
      - cvTaskName: cvtask003
        cvTaskNamespace: cv-config
        executionOrder: 2
        id: custom-testcase-123
        isDisabled: true