No space left on device


I'm experiencing a very strange "No space left on device" error when using a custom AMI for AWS Batch.

The AMI was created from the ECS-Optimized Amazon Linux AMI 2017.03, to which a third EBS volume of 1000 GB was added. The Docker storage was extended as explained at http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-ami-storage-config.html, i.e.:

sudo vgextend docker /dev/xvdb
sudo lvextend -L+1000G /dev/docker/docker-pool

However, when launching a few jobs, I almost immediately get the following error message:

.command.run.1: line 50: cannot create temp file for here-document: No space left on device
tee: .command.err: No space left on device
mkdir: cannot create directory ‘fastqc_SRR3192434_logs’: No space left on device

After logging in to the instance, there seems to be enough space:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      7.8G  1.2G  6.5G  16% /
devtmpfs         32G   96K   32G   1% /dev
tmpfs            32G     0   32G   0% /dev/shm

$ sudo vgs
  VG     #PV #LV #SN Attr   VSize    VFree  
  docker   2   1   0 wz--n- 1021.99g 224.00m

$ sudo lvs
  LV          VG     Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker twi-aot--- 1021.73g             2.46   11.87  

$ docker info 
Containers: 8
 Running: 3
 Paused: 0
 Stopped: 5
Images: 3
Server Version: 17.03.2-ce
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: ext4
 Data file: 
 Metadata file: 
 Data Space Used: 28 GB
 Data Space Total: 1.097 TB
 Data Space Available: 1.069 TB
 Metadata Space Used: 3.031 MB
 Metadata Space Total: 25.17 MB
 Metadata Space Available: 22.13 MB
 Thin Pool Minimum Free Space: 109.7 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.51-10.52.amzn1.x86_64
Operating System: Amazon Linux AMI 2017.09
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.91 GiB
Name: ip-172-30-2-110
ID: VEEF:VQDY:Z72J:NY25:YIMO:BG7Z:J5EH:ZBXU:IKLX:OJN2:F7GM:EY5Q
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Any idea what's wrong?

Edited by: paulecci on Oct 28, 2017 5:00 AM

asked 6 years ago, 1201 views
4 Answers

Could you check the value of DOCKER_STORAGE_OPTIONS in /etc/sysconfig/docker-storage?

We've increased the container size using, e.g., --storage-opt dm.basesize=65G.
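
To see which per-container limit is currently in effect, the "Base Device Size" line of docker info can be filtered out. This is only a minimal sketch, assuming the devicemapper storage driver; the function name base_device_size is made up for illustration and is not part of Docker:

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): reads `docker info` output on
# stdin and prints the value of the "Base Device Size" line, if present.
base_device_size() {
    awk -F': ' '/Base Device Size/ { print $2 }'
}

# Usage on the instance (assumes the devicemapper storage driver):
#   docker info 2>/dev/null | base_device_size
```

Fed the docker info output from the question, this prints 10.74 GB, which suggests that each container's filesystem is capped at roughly 10 GB regardless of how large the thin pool is.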

answered 6 years ago

The current content of file /etc/sysconfig/docker-storage is:

DOCKER_STORAGE_OPTIONS="--storage-driver devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.fs=ext4 --storage-opt dm.use_deferred_deletion=true"

Edited by: paulecci on Oct 30, 2017 5:21 AM

answered 6 years ago

Adding the --storage-opt dm.basesize=200GB option solved the problem. Thanks for pointing me in the right direction.
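
For reference, here is a rough sketch of how the option could be appended to /etc/sysconfig/docker-storage. The helper name add_basesize is hypothetical, and the sketch assumes the file contains a single double-quoted DOCKER_STORAGE_OPTIONS="..." line like the one posted above; a file laid out differently would need a different edit:

```shell
#!/bin/sh
# Hypothetical helper: append "--storage-opt dm.basesize=SIZE" to the
# DOCKER_STORAGE_OPTIONS line of FILE, unless dm.basesize is already set.
# Assumes a single line of the form DOCKER_STORAGE_OPTIONS="...".
add_basesize() {
    file="$1"
    size="$2"
    if grep -q 'dm\.basesize=' "$file"; then
        return 0    # already configured; leave the file untouched
    fi
    tmp=$(mktemp) || return 1
    # Insert the new option just before the closing double quote, then write
    # the result back in place so the file keeps its owner and permissions.
    sed "s|^\(DOCKER_STORAGE_OPTIONS=\".*\)\"\$|\1 --storage-opt dm.basesize=${size}\"|" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (as root), followed by a Docker restart. The restart command is an
# assumption for Amazon Linux 2017.x:
#   add_basesize /etc/sysconfig/docker-storage 200GB
#   service docker restart
```

As far as I understand the devicemapper documentation, the larger base size only applies to images pulled and containers created after the daemon restart, so existing images may have to be removed and pulled again.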

Edited by: paulecci on Oct 30, 2017 1:32 PM

answered 6 years ago

Hello,

I am using a p3.8xlarge instance.
I could not find /etc/sysconfig/docker-storage.
Could you help me find it?

Thanks.

Edited by: Aldebaran33 on Dec 10, 2018 6:43 PM

answered 5 years ago
