# Troubleshooting
This guide helps diagnose and resolve common issues with the Scality CSI Driver for S3.
## Quick Diagnostics
### 1. Check Driver Health
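A typical health check looks like the following (the namespace and label selector match the log-collection commands later in this guide):

```shell
# Driver pods should all be Running
kubectl get pods -n kube-system -l app.kubernetes.io/name=scality-mountpoint-s3-csi-driver

# The driver should be registered with the cluster
kubectl get csidriver s3.csi.scality.com

# Recent events often reveal mount or registration failures
kubectl get events -n kube-system --sort-by=.lastTimestamp | tail -20
```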
### 2. Check S3 Connectivity
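Endpoint reachability and credential validity can be checked from any host with network access to the S3 endpoint (endpoint, keys, and bucket name below are placeholders):

```shell
# Endpoint reachable? Any HTTP response, even an error document, means the network path works
curl -sk <endpoint>

# Credentials valid? Use the same access keys stored in the driver secret
AWS_ACCESS_KEY_ID=<access-key> AWS_SECRET_ACCESS_KEY=<secret-key> \
  aws s3 ls --endpoint-url <endpoint>

# Target bucket accessible?
aws s3api head-bucket --bucket <bucket-name> --endpoint-url <endpoint>
```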
## Common Issues and Solutions
### Pod Issues
| Symptom | Cause | Solution |
|---|---|---|
| Pod stuck in ContainerCreating | Mount operation failed | 1. Check driver logs 2. Check S3 credentials 3. Check mount options 4. Ensure unique volumeHandle |
| Pod stuck in Terminating | Mount point busy or corrupted | 1. Force delete pod: `kubectl delete pod <name> --force` 2. Check for subPath issues (see below) |
| Pod fails with "Permission denied" | Missing mount permissions | Add allow-other to PV mountOptions |
| Pod cannot write/delete files | Missing write permissions | Add allow-delete and/or allow-overwrite to PV mountOptions |
| MountVolume.SetUp failed: context deadline exceeded, with mounter pod log showing accept unix /comm/mount.sock: i/o timeout | Mounter pod missing FSGroup in security context | Upgrade to the latest release. As a workaround, remove fsGroup from the workload pod's security context |
| Pod stuck in ContainerCreating with "driver name s3.csi.scality.com not found in the list of registered CSI drivers" | CSI driver not yet registered (startup race condition) | Apply the s3.csi.scality.com/agent-not-ready:NoExecute taint to nodes. See Node Startup Taint |
| Pod stuck in ContainerCreating with configmap "..." not found event | CA certificate ConfigMap missing from the pod's namespace | Create the ConfigMap in the correct namespace. See TLS Troubleshooting |
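The permission fixes in the table above are applied through the PersistentVolume's mountOptions; an illustrative fragment:

```yaml
spec:
  mountOptions:
    - allow-other      # let non-root containers access the mount
    - allow-delete     # permit file deletion
    - allow-overwrite  # permit overwriting existing objects
```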
### Mount Issues
| Error Message | Cause | Solution |
|---|---|---|
| "Transport endpoint not connected" | S3 endpoint unreachable | 1. Check network connectivity 2. Check endpoint URL configuration 3. Check security groups/firewall rules |
| "Failed to create mount process" | Mountpoint binary issue | 1. Check initContainer logs 2. Check /opt/mountpoint-s3-csi/bin/mount-s3 exists on node |
| "Access Denied" | Invalid S3 credentials | 1. Check secret contains access_key_id and secret_access_key2. Test credentials with AWS CLI 3. Check bucket policy |
| "InvalidBucketName" | Bucket name issue | 1. Check bucket exists 2. Check bucket name format 3. Ensure no typos |
| "AWS_ENDPOINT_URL environment variable must be set" | Missing endpoint configuration | Set s3EndpointUrl in Helm values or driver configuration |
| TLS handshake failure or certificate verify failed | CA certificate ConfigMap missing or incorrect | Check the CA ConfigMap exists in both the controller namespace (default: kube-system) and the mounter pod namespace (mountpointPod.namespace, default: mount-s3) with key ca-bundle.crt. See TLS Configuration |
### Volume Issues
| Issue | Description | Solution |
|---|---|---|
| Multiple volumes fail in same pod | Duplicate volumeHandle | Ensure each PV has a unique volumeHandle value |
| subPath returns "No such file or directory" | Empty directory removed by Mountpoint | Use prefix mount option instead of subPath (see below) |
| Volume not mounting | Misconfigured PV/PVC | Check storageClassName: "" for static provisioning |
## Known Limitations and Workarounds
### SubPath Behavior
When using subPath with S3 volumes, deleting all files in the directory causes the directory itself to disappear, making the mount unusable.
Instead of:
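The problematic pattern is a subPath on the container's volume mount, for example (names are placeholders):

```yaml
volumeMounts:
  - name: s3-volume
    mountPath: /data
    subPath: some-dir   # breaks once the directory becomes empty
```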
Use prefix mount option:
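The equivalent prefix-based setup is configured on the PV instead (same placeholder directory):

```yaml
mountOptions:
  - prefix=some-dir/   # scope the mount to this key prefix
```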
### Multiple Volumes in Same Pod
Each PersistentVolume must have a unique volumeHandle:
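A minimal sketch of two PVs intended for the same pod, each with a distinct volumeHandle (bucket names, capacities, and handles are placeholders; check the volumeAttributes keys against your driver version):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-pv-one
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  storageClassName: ""              # required for static provisioning
  csi:
    driver: s3.csi.scality.com
    volumeHandle: s3-pv-one-handle  # must be unique across all PVs
    volumeAttributes:
      bucketName: bucket-one
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-pv-two
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  csi:
    driver: s3.csi.scality.com
    volumeHandle: s3-pv-two-handle  # distinct from the first PV's handle
    volumeAttributes:
      bucketName: bucket-two
```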
## Uninstallation Issues
### Namespace Stuck Terminating
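If the namespace hangs on finalizers, the standard Kubernetes workaround applies (use with care; it bypasses normal cleanup; namespace name is a placeholder):

```shell
# See what is still blocking deletion
kubectl get namespace <namespace> -o json | jq '.status'

# Clear the finalizers so the namespace can terminate
kubectl get namespace <namespace> -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/<namespace>/finalize" -f -
```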
### PersistentVolumes Stuck Terminating
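PVs usually hang on a protection finalizer; removing it lets deletion complete (PV name is a placeholder):

```shell
# Find PVs stuck in Terminating
kubectl get pv | grep Terminating

# Remove the finalizers blocking deletion
kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}'
```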
### Orphaned Helm Release
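If the release's resources are already gone but Helm still lists it, the release can be removed as follows (release and namespace names are placeholders):

```shell
# Confirm the release still appears
helm list -A

# Normal removal
helm uninstall <release-name> -n <namespace>

# Last resort: delete Helm's release records directly
kubectl delete secret -n <namespace> -l owner=helm,name=<release-name>
```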
## Debug Mode
Enable debug logging for detailed diagnostics:
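One way to raise verbosity is through Helm. The values key name depends on the chart version, so node.logLevel below is an assumption; verify it against the chart's values.yaml:

```shell
helm upgrade scality-mountpoint-s3-csi-driver <chart-ref> \
  -n kube-system \
  --reuse-values \
  --set node.logLevel=4   # assumed key; check values.yaml
```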
View debug logs:
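Using the same selectors as the diagnostics commands in this guide:

```shell
# Follow the node plugin's logs
kubectl logs -n kube-system -l app.kubernetes.io/name=scality-mountpoint-s3-csi-driver -c s3-plugin -f
```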
## Performance Troubleshooting
| Symptom | Possible Cause | Action |
|---|---|---|
| Slow file operations | High S3 latency | 1. Check network latency to S3 2. Enable caching with cache mount option 3. Consider using closer S3 region |
| High memory usage | Large cache size | Limit cache with max-cache-size mount option |
| Slow directory listings | No metadata caching | Add metadata-ttl mount option (e.g., metadata-ttl=60) |
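The options in the Action column are set as PV mountOptions; an illustrative combination (values are examples; the option spellings follow this table and should be checked against your Mountpoint version):

```yaml
mountOptions:
  - cache /tmp/s3-cache    # enable local data caching
  - max-cache-size 1024    # cap the cache size
  - metadata-ttl=60        # cache metadata for 60 seconds
```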
## Getting Help
If issues persist after following this guide:
1. Collect diagnostic information:

    ```shell
    # CSI driver logs (all containers)
    kubectl logs -n kube-system -l app.kubernetes.io/name=scality-mountpoint-s3-csi-driver --all-containers=true > csi-driver-logs.txt

    # Node plugin logs specifically
    kubectl logs -n kube-system -l app.kubernetes.io/name=scality-mountpoint-s3-csi-driver -c s3-plugin > node-plugin-logs.txt

    # CSI node driver registrar logs
    kubectl logs -n kube-system -l app.kubernetes.io/name=scality-mountpoint-s3-csi-driver -c node-driver-registrar > registrar-logs.txt

    # Your pod description and events
    kubectl describe pod <your-pod> > pod-description.txt

    # PV and PVC details
    kubectl describe pv <your-pv> > pv-description.txt
    kubectl describe pvc <your-pvc> > pvc-description.txt

    # S3 bucket configuration (if accessible)
    aws s3api get-bucket-location --bucket <bucket-name> --endpoint-url <endpoint> > bucket-location.txt
    aws s3api get-bucket-versioning --bucket <bucket-name> --endpoint-url <endpoint> > bucket-versioning.txt
    aws s3api get-bucket-policy --bucket <bucket-name> --endpoint-url <endpoint> > bucket-policy.txt 2>&1
    aws s3api list-objects-v2 --bucket <bucket-name> --max-items 10 --endpoint-url <endpoint> > bucket-list-sample.txt
    ```

2. Contact Scality Support with:

    - Driver version
    - Kubernetes version
    - Error messages and all information collected in Step 1
    - PV/PVC/Pod YAML manifests (sanitized)