cback configuration

cback is heavily configurable. The advantage is that it can be adapted to a wide range of use cases; the drawback is that learning how to configure it can be tedious at first. cback relies on a few concepts of its own that must be understood before applying a non-default configuration.

Groups

Groups in cback are an abstract concept: a group names a specific configuration that agents apply at runtime to every job tagged as a member of that group (tagging is normally done when the backup job is first created). Groups are optional and are intended as a way for a cback operator to tailor the backup procedure. Below are a few examples of how groups could be used to delimit configuration policy:

Source type: ceph, eos, nfs 
Retention period: short, medium, long
Individual-services: webservers, user-directories, images, videos, git-repos  
Departments: IT, Finance, Beams, Power-systems, HR

As each of these groups is user-defined (along with the configuration it applies), the scope of each can be chosen to match whatever underlying concept best suits the operator.

Creating a group

[warning❗] Create a group definition in cback only after you have finished writing your cback configuration; otherwise cback agents will look for a configuration to apply that simply isn't there.

You can create a group as shown in the example below:

# check for any existing groups

$ cback group ls
╒══════╤════════╕
│   Id │ Name   │
╞══════╪════════╡
│    1 │ manila │
╘══════╧════════╛

# add your new group(s)

$ cback group add example
$ cback group ls
╒══════╤═════════╕
│   Id │ Name    │
╞══════╪═════════╡
│    2 │ example │
├──────┼─────────┤
│    1 │ manila  │
╘══════╧═════════╛

Portal Scopes

The cback portal allows a 'permission scope' to be applied to LDAP users. The scope controls which actions a user can take when interacting with a cback system via the portal. This section can be disregarded if you do not intend to use the cback portal. A cback scope consists of the following elements:

username:    the username of the specific LDAP user the scope will be applied to
group:       the string value of a group name, can be substituted with "*" for ALL groups.
id:          the integer value of a specific backup job id, can be substituted with "*" for ALL backups.
permissions: a permission list that delimits which actions the user can take
    - create: allows the user to create a new backup job
    - delete: allows the user to delete jobs from cback
    - restore: allows the user to perform cback restores
    - run: allows the manual triggering of a job run outside cback's scheduling

As the cback portal is primarily aimed at API clients or service accounts for a given category of backup, rather than at end users, these scope tuples are purposely quite coarse. Creating a new scope is relatively simple. For example, a user can be limited to:

have full power on all backups within a specific group called example:

$ cback portal scope new -u example -s \
  '[{"group": "example", "id": "*", "permissions":["restore", "create", "run", "delete"]}]'

have limited power to run only a specific backup:

$ cback portal scope new -u example -s \
  '[{"group": "example", "id": "1", "permissions":["run"]}]'

have global power to delete (but nothing else) all backups in all groups:

$ cback portal scope new -u example -s \
  '[{"group": "*", "id": "*", "permissions":["delete"]}]'
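
Because the scope is passed as a JSON string inside a single-quoted shell argument, quoting mistakes are easy to make. The following is a minimal Python sketch, not part of cback itself, that builds the argument programmatically. The helper names build_scope and scope_command are hypothetical; the permission names, the -u and -s flags and the JSON layout are taken from the examples above.

```python
import json
import shlex

# Permission names as listed in this section.
ALLOWED_PERMISSIONS = {"create", "delete", "restore", "run"}


def build_scope(group, job_id, permissions):
    """Return one scope entry as a dict, rejecting unknown permissions."""
    unknown = set(permissions) - ALLOWED_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    # "*" stands for all groups / all backups; ids are emitted as strings,
    # mirroring the examples above.
    return {"group": group, "id": str(job_id), "permissions": list(permissions)}


def scope_command(username, scopes):
    """Build a shell-safe 'cback portal scope new' command line (hypothetical helper)."""
    payload = json.dumps(scopes)
    return " ".join(
        ["cback", "portal", "scope", "new",
         "-u", shlex.quote(username),
         "-s", shlex.quote(payload)]
    )


if __name__ == "__main__":
    scopes = [build_scope("example", "*", ["restore", "create", "run", "delete"])]
    print(scope_command("example", scopes))
```

The printed line can be pasted into a shell as-is; shlex.quote adds the surrounding single quotes so that the JSON reaches cback as one argument.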

Creating a configuration file

Configuring cback can seem daunting at first because of the sheer number of parameters available. Some things to keep in mind:

cback agents look for a configuration file at the default path /etc/cback/config.yaml.
Agents will pull the elements that are relevant to their own operation from the configuration.
As a result, the configuration is expected to be global: all worker nodes of a cback system should have an identical configuration file.
Most parameters in cback have default values. You can use ack <parameter_name> to see where a parameter is defined and used in the source code, and whether it has a default value.
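
Since every worker node is expected to carry the same file, one way to spot drift is to compare a checksum of the configuration across nodes. The sketch below is an assumption about how an operator might do this, not a cback feature. The helpers config_digest and nodes_in_sync are hypothetical, and how the digests are collected from each node (ssh, configuration management, and so on) is left out.

```python
import hashlib
from pathlib import Path

DEFAULT_CONFIG = "/etc/cback/config.yaml"  # default path named above


def config_digest(path=DEFAULT_CONFIG):
    """Return the SHA-256 hex digest of a configuration file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def nodes_in_sync(digests):
    """True if every node reported the same digest.

    `digests` maps a node name to the digest collected on that node.
    """
    return len(set(digests.values())) <= 1


if __name__ == "__main__":
    path = Path(DEFAULT_CONFIG)
    if path.exists():
        print(path, config_digest(path))
    else:
        print(f"{path} not found on this machine")
```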

Below is an example configuration file with a comment on the purpose of each parameter. It is not exhaustive, but it provides a good starting point for configuring cback.

---
log:  ## Logging configuration
  level: "debug" # log level of cback
  output: "/var/log/cback/cback.log" # output location of agent logs on the worker node

database:  ## database configuration
  host: "example-db.cern.ch" # hostname of the worker node running the cback database
  user: "example-db-admin" # username of the db user to interact as
  port: 5505 # integer port of the database service
  password: "example-pass" # password for the database user
  database: "cbackboxdb" # instance name of the specific internal database to use for cback

metrics: ## metric configuration (can be dropped if not used.)
  driver: "example" # driver to use
  drivers:             
    example: ## in this way, a user can define multiple different drivers. 
      host: "graphite-host.example.com"
      port: 2003
      timeout: 30

agents: ## Agent configuration
  all:  ## global configuration, any parameters that span groups and agents should go here.
    wait_time: 10 # how long to wait between polling for a new job to attempt

  example_group: ## the beginning of a group configuration section, **NOTE** multiple are allowed.
    shared:      ## configuration common to all agents WITHIN the named group
      retry_if_error: True # whether to retry a job on job failure status
      max_retries: 3  # how many times to reattempt a given job on failure
      timeout: 259000 # how long to wait (in seconds) before considering a job failed due to timeout
      s3_connections: 32 # number of parallel connections to S3 allowed per agent
      destination_env:
        AWS_ACCESS_KEY_ID: "example" # S3 access key for the target backup endpoint
        AWS_SECRET_ACCESS_KEY: "example" # S3 secret key for the target backup endpoint
      destination_config:
        s3_endpoint: "s3:https://s3-endpoint.example.com" # S3 target backup endpoint 
      bucket_prefix: "eoscanary" # a prefix to apply to all buckets within this group
      source_type: "ceph" # the source file system expected for this type of backup
      destination_type: "ceph" # the destination file system for restores for this type of backup
      password: "AAAAAAAAA" # the restic password to use for protecting all backups in this group
      cache_dir: "/var/tmp/cback" # the (per worker node) local directory to use for agent cback caching
    backup:  ## backup agent/s specific config
      enabled: True # should the backup agent/s be enabled for this group?
      force: False # should cback force a backup, even if no file changes have occurred since the last?
      files_error_log: "/var/log/cback/cback-backup-errors.log" # where to log errors on backing up specific files
      exclude_list: ["*.sys.a#*", "*.sys.v#*"] # exclusion wildcards, following restic --exclude syntax
      method: "random" # run method, where random == a random agent, or if supplied by name a specific worker node
    prune: ##prune agent/s specific config
      enabled: True # should the prune agent/s be enabled for this group? 
      method: "random" # run method, where random == a random agent, or if supplied by name a specific worker node
      retention_policy: ## definition for the retention policy of restic snapshots, controls how snapshots are pruned
        keep-daily: 7
        keep-weekly: 5
        keep-monthly: 6
      graceful_deletion: False # should prune agents use graceful deletion? (overrides the retention period; snapshots have to be manually deleted)
      delete_graceful_period: 20000 # if used, defines a period (in seconds) to wait before deleting a snapshot.
    verify: ## verify agent/s specific config
      enabled: True
      destination_path: "/var/tmp/cback" # base directory on the worker node where the verify takes place
    restore: # restore agent/s specific config
      enabled: True
    switch: ## switch agent/s specific config
      enabled: True
      expiration_times: 
        backup: 86400 # the time (in seconds) before a backup job should be repeated for a backup
        prune: 86400  # the time (in seconds) before a prune job should be repeated for a backup
        verify: 2592000 # the time (in seconds) before a verify job should be repeated for a backup
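
Several values in the configuration above are plain second counts, which are hard to read at a glance. As a small aid, the following Python sketch converts the example values into days, hours, minutes and seconds. The dictionary keys are only labels for this illustration; the numbers are copied from the example file.

```python
from datetime import timedelta

# Second counts copied from the example configuration above.
EXAMPLE_VALUES = {
    "shared.timeout": 259000,
    "prune.delete_graceful_period": 20000,
    "switch.expiration_times.backup": 86400,
    "switch.expiration_times.prune": 86400,
    "switch.expiration_times.verify": 2592000,
}


def humanize(seconds):
    """Render a second count using timedelta's default string form."""
    return str(timedelta(seconds=seconds))


if __name__ == "__main__":
    for name, seconds in EXAMPLE_VALUES.items():
        print(f"{name}: {seconds} s = {humanize(seconds)}")
```

With these example values, backup and prune jobs repeat daily, verify jobs repeat every 30 days, and the job timeout is just under three days.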

If you intend to use the cback portal, further configuration must be added:

portal: ## cback portal specific config
  auth: ## cback portal authorization config
    providers: ## authentication providers
      ldap:    ## configuration for ldap querying, requires a service account for lookups and impersonation
        hostname: "ldap.example.com"  # ldap service hostname
        port: 389 # port to use for connecting to ldap
        bind_username: "example" # LDAP3 schema username
        bind_password: "example" # user password
      impersonator: ## cback user impersonator configuration, mirrors database configuration nominally.
        host: "example-db.cern.ch" # hostname of the worker node running the cback database
        user: "example-db-admin" # username of the db user to interact as
        port: 5505 # integer port of the database service
        password: "example-pass" # password for the database user
        database: "cbackboxdb" # instance name of the specific internal database to use for cback
    mapping: ## user mapping method supports these
      basic: ["impersonator", "ldap"]
    jwt:
      secret_key: # json web token secret key, used to sign client tokens, generate with 'openssl rand -hex 32'
      expire: 60 # expiry time for a given user JWT token in minutes
    cache: # REDIS cache cli details
      host: "localhost" # can be left as localhost
      port: 6379 # can be left as 6379
      password: "" # can be left empty unless securing
    scope: ## scope specific configuration
      rules: ## scope rules to apply on a per group basis
        example: ["manual"] # example rule definition, apply the manual driver scope method to the group "example"
      drivers:
        manual: ## manual scope driver configuration
          host: "example-db.cern.ch" # hostname of the worker node running the cback database
          user: "example-db-admin" # username of the db user to interact as
          port: 5505 # integer port of the database service
          password: "example-pass" # password for the database user
          database: "cbackboxdb" # instance name of the specific internal database to use for cback
  mount:
    nodes: ["https://cback-portal-example:9000"] # a list of all cback portal nodes that can mount
    api_token: "change-me" # http api-token used for checking against when performing a mount via the portal 
  user: ## user scope lookup section
    enable_cache: true # allow the REDIS cache to cache portal user scopes lookups
    cache: # ditto of cache configuration for portal user scopes
      host: "localhost" # ditto the above cache section 
      port: 6379 # ditto the above cache section
      password: "" # ditto the above cache section
    driver: "rest" # the driver to use for user lookup, leave on rest.
    drivers: ## driver configuration
      rest:
        oidc_token_endpoint: "https://oidc-endpoint.com/auth/realms/cern/api-access/token" # OIDC token endpoint (API authoriser url)
        target_api: "example-authorization-service-api" #  target API service name (API name)
        api_base_url: "https://example-service-api.com" # target API service base url (API url)
        client_id: "" # username for OIDC token endpoint (API authoriser username)
        client_secret: "" # password for OIDC token endpoint (API authoriser secret)
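
The portal example above leaves secret_key empty and sets api_token to a placeholder. As a rough Python equivalent of the 'openssl rand -hex 32' command mentioned in the jwt comment, the sketch below generates a random 32-byte hex string. The helper name new_hex_secret is hypothetical, and using the same approach for api_token is an assumption rather than a documented requirement.

```python
import secrets


def new_hex_secret(num_bytes=32):
    """Return num_bytes of random data as a hex string (2 characters per byte)."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # 32 random bytes -> 64 hex characters, the same shape as `openssl rand -hex 32`.
    print(new_hex_secret())
```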