Contributor

@bennotieno bennotieno commented Jan 26, 2026

  • Adds qdrant_backup_recovery.sh for Qdrant backup and restore, together with its documentation.
  • Adds a GitHub workflow that updates qdrant-backup-restore/k8s/configmap-script.yaml whenever qdrant-backup-restore/qdrant_backup_recovery.sh changes.

@bennotieno bennotieno changed the title from "add scripts for qdrant backup and restore" to "feat: add scripts for qdrant backup and restore" on Jan 26, 2026
@github-actions github-actions bot added the feat label on Jan 26, 2026
@bennotieno bennotieno force-pushed the PI-1306 branch 7 times, most recently from 938ad63 to 8d1fe9f, on January 28, 2026 16:45
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 1

Why so low? The default is 6.

Contributor Author

I think it is better to know why it failed before retrying: the failure could be a non-issue, but it could just as well be critical.


Always assume a non-proficient operator on the customer side.

Contributor Author

Okay, I can increase it to 3 for now, since these values will be overridden in case of a deployment.

Contributor Author

done


# Job history limits
successfulJobsHistoryLimit: 1
failedJobsHistoryLimit: 1

same here

Contributor Author

@bennotieno bennotieno Jan 30, 2026

I used this configuration to avoid a build-up of completed and failed jobs in the cluster. At the moment I think only the previous run matters for debugging or manually tracking progress.


Try to bump failures and retries.

Contributor Author

done

    secretKeyRef:
      name: qdrant-credentials-minio # change this to your qdrant kubernetes secret name
      key: QDRANT_S3_ACCESS_KEY_ID
- name: QDRANT_S3_SECRET_ACCESS_KEY

Use AWS_* vars; then the default toolchains/SDKs will work out of the box.

Contributor Author

The reason I used custom vars is to avoid cases where the script ends up using the user's current configuration or forcing them to change it. For example, they could already have unrelated AWS_* vars that they don't want to change, but still want to be able to run the script.

Isolation was the main reason here.


but these are only mounted to this job?

Contributor Author

@bennotieno bennotieno Jan 30, 2026

The script is also meant to be run directly on a machine (workstation/personal).
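
One possible way to reconcile the two positions in this thread is to keep the QDRANT_S3_* names as the script's public interface and map them onto the standard AWS_* names only for the lifetime of a single command. The sketch below is an illustration under assumptions, not code from this PR: `with_qdrant_s3_credentials` is a hypothetical helper name, and the bucket in the usage comment is a placeholder. Only `QDRANT_S3_ACCESS_KEY_ID` and `QDRANT_S3_SECRET_ACCESS_KEY` come from the manifests under review.

```shell
#!/bin/sh
# Sketch only: with_qdrant_s3_credentials is a hypothetical helper, not part of the PR.
# It maps the script-specific QDRANT_S3_* variables onto the standard AWS_* names
# inside a subshell, so the caller's own AWS_* settings are never modified.
with_qdrant_s3_credentials() {
  (
    # ${VAR:?message} makes the subshell fail if VAR is unset or empty.
    AWS_ACCESS_KEY_ID="${QDRANT_S3_ACCESS_KEY_ID:?QDRANT_S3_ACCESS_KEY_ID is not set}"
    AWS_SECRET_ACCESS_KEY="${QDRANT_S3_SECRET_ACCESS_KEY:?QDRANT_S3_SECRET_ACCESS_KEY is not set}"
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
    # Run the wrapped command; the exports disappear when the subshell exits.
    "$@"
  )
}

# Example (assumes the AWS CLI is installed; the bucket name is a placeholder):
#   with_qdrant_s3_credentials aws s3 ls "s3://example-qdrant-backups"
```

With this shape, standard S3 tooling sees the AWS_* variables it expects, while an operator's unrelated AWS_* configuration on a workstation stays untouched after the command returns.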

  name: qdrant-credentials
type: Opaque
stringData:
  QDRANT_S3_ACCESS_KEY_ID: your-s3-access-key-id

same as above


Important steps to note:

- **Create** S3 bucket.
- Here you can use object storage service providers like StackIt, AWS e.t.c or a self hosted solution like [minio](https://www.min.io/).

Can we still use minio? I think there was a license change.

Contributor Author

Technically yes: it still has a community edition, but after the license change its releases will be missing pre-built binaries and the management console. It would be better to switch to another tool like https://garagehq.deuxfleurs.fr/.


let's do that

Contributor Author

Yeah, sure. Maybe just add it to the docs, since at the moment the client already has an S3 storage setup.

value: "s3"
````

(Re)Deploy the Qdrant cluster! This should trigger a rolling update on the qdrant nodes.

we might need to go nitty gritty here


or link to official docs

Contributor Author

Cool, will do!

- name: qdrant-backup-restore-script
  configMap:
    name: qdrant-backup-restore-script
    defaultMode: 0755

This is a sensible mode for a multi-user target; in containers you can be a bit stricter (e.g. 700), but keep it like this for now.

4 participants