Optimizing etcd on Slow Disks in Kubernetes

In Kubernetes, etcd is the central database that stores the entire cluster state.
If etcd runs on slow disks, the whole control plane feels it: API requests slow down, pods take longer to schedule, and the cluster becomes “laggy.”

Why etcd Struggles on Slow Disks

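etcd is very I/O-intensive. Every write is appended to the write-ahead log (WAL) and synced to disk (fsync) before it is acknowledged, so write latency can never be better than the disk's sync latency.
On spinning HDDs or cheap cloud disks with poor IOPS, etcd quickly becomes a bottleneck, and slow syncs can delay Raft heartbeats enough to trigger leader elections.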

Typical symptoms:

Slow kubectl responses
Pods stuck in Pending
Increased API server latency
High disk usage in /var/lib/etcd
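
To check whether the disk itself is the problem, a common approach is a small fio run that syncs after every write, which roughly imitates how etcd writes its WAL. The sketch below assumes fio is installed; the function name etcd_disk_check and the job name are made up for illustration, and the block and file sizes are commonly used approximations of etcd's write pattern, not exact values.

```shell
# Run a small sequential write test that calls fdatasync after every write.
# $1 is a scratch directory on the same disk that holds the etcd data.
etcd_disk_check() {
  dir="${1:-}"
  if [ -z "$dir" ]; then
    echo "usage: etcd_disk_check <directory>" >&2
    return 2
  fi
  mkdir -p "$dir" || return 1
  # Small writes (2300 bytes), each followed by fdatasync, 22 MB in total.
  fio --name=etcd-fsync-check --directory="$dir" \
      --rw=write --ioengine=sync --fdatasync=1 \
      --size=22m --bs=2300
}
```

For example, etcd_disk_check /var/lib/etcd-fio-test (a hypothetical scratch path on the etcd disk). In the fio output, look at the fsync/fdatasync latency percentiles; a 99th percentile well above 10 ms suggests the disk is too slow for etcd. Delete the scratch directory afterwards.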

Running Defragmentation

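etcd keeps a history of changes (MVCC). Over time, the database grows, even if old keys are deleted.
Two separate operations deal with this. Compaction discards old revisions; in Kubernetes the API server requests it periodically (every 5 minutes by default). Compaction alone does not shrink the database file, though: the freed pages stay inside it. This is why etcd provides defrag, which rewrites the file and returns that free space to the filesystem.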

Example:

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  defrag
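
Defragmentation only gives back space that compaction has already freed. If history has not been compacted recently, a compaction step can run first. The following is a minimal sketch of that sequence, reusing the endpoint and certificate paths from the example above; the helper names ectl, current_revision, and compact_and_defrag are made up for illustration. Note that compacting to the current revision discards all older history.

```shell
# Wrapper that reuses the endpoint and certificate paths from the example above.
ectl() {
  ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    "$@"
}

# Pull the revision number out of `endpoint status` JSON read from stdin.
current_revision() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Compact history up to the current revision, then defragment this member.
compact_and_defrag() {
  rev=$(ectl endpoint status --write-out=json | current_revision)
  if [ -z "$rev" ]; then
    echo "could not read current revision" >&2
    return 1
  fi
  ectl compact "$rev" && ectl defrag
}
```

Running compact_and_defrag on a control-plane node performs both steps against the local member. Comparing the output of ectl endpoint status --write-out=table before and after shows how much the DB size dropped.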

Best Practices for Slow Disks

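Defragment regularly – reclaims the space freed by compaction and prevents DB bloat. Defragmentation blocks reads and writes on the member while it runs, so do one member at a time.
Set a quota – use --quota-backend-bytes to cap the etcd database size (the default is 2 GiB). When the quota is exceeded, etcd raises a NOSPACE alarm and rejects further writes until space is reclaimed and the alarm is disarmed.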
Move etcd to faster disks if possible (SSD/NVMe).
Monitor latency – watch etcd_disk_wal_fsync_duration_seconds; the etcd documentation suggests keeping its 99th percentile below 10 ms.
Avoid running etcd with noisy neighbors – dedicate resources.
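
As a quick check without a full monitoring stack, the fsync histogram can be read straight from etcd's metrics endpoint. The sketch below assumes the Prometheus text format with unlabelled etcd_disk_wal_fsync_duration_seconds buckets and a bucket boundary at 0.016 seconds; the function name wal_fsync_within is made up for illustration.

```shell
# Read Prometheus text-format metrics on stdin and print the percentage of
# WAL fsyncs that finished within the given histogram bucket (default 0.016 s).
wal_fsync_within() {
  bucket="${1:-0.016}"
  awk -v le="$bucket" '
    # Cumulative count of fsyncs at or below the chosen bucket boundary.
    index($0, "etcd_disk_wal_fsync_duration_seconds_bucket{le=\"" le "\"}") == 1 { within = $2 }
    # Total number of observed fsyncs.
    index($0, "etcd_disk_wal_fsync_duration_seconds_count") == 1 { total = $2 }
    END {
      if (total > 0) printf "%.1f\n", 100 * within / total
      else print "no samples"
    }'
}
```

For example, curl -s http://127.0.0.1:2381/metrics | wal_fsync_within 0.016, assuming the metrics endpoint listens on that address (adjust to your setup). A result noticeably below 99 means more than 1% of fsyncs took longer than 16 ms, which points to a disk that is too slow.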

Conclusion

Running etcd on slow disks is risky, but with proper defragmentation, quotas, and monitoring, you can keep the cluster responsive. If your cluster is critical, always prefer fast SSD storage for etcd.
