ECS 3 CE single node services unavailable / root file system full

I have seen mentions of out-of-space issues, as well as an unresponsive web UI and services. I exec'd into the container and found that the root file system was full.
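
For anyone checking for the same condition, here is a minimal sketch of one way to confirm that root is full and find what is filling it. The helper name `list_big_files` and the size thresholds are illustrative assumptions, not part of ECS, and the demonstration runs against a scratch directory rather than the real container.

```shell
#!/bin/sh
# Sketch only: report root file system usage, then list oversized files.
# list_big_files is a hypothetical helper; the thresholds are illustrative.
list_big_files() {
    # $1 = directory to scan, $2 = minimum size in KiB.
    # -xdev keeps find on one file system, so separately mounted data
    # volumes are not walked.
    find "$1" -xdev -type f -size +"$2"k -print 2>/dev/null
}

# How full is the root file system?
df -h /

# Demonstration on a scratch directory; inside the container one might run
# something like: list_big_files / 1048576   (files larger than 1 GiB).
demo_dir=$(mktemp -d)
dd if=/dev/zero of="$demo_dir/big.log" bs=1024 count=8 2>/dev/null
printf 'tiny\n' > "$demo_dir/small.log"
big_found=$(list_big_files "$demo_dir" 4)
echo "$big_found"
```

The `k` suffix for `-size` is supported by GNU and BSD `find`; other implementations may only accept block or byte counts.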

ecs0:/tmp # ls -lsa

total 52

0 drwxrwxrwt 5 root      root        192 Mar 20 19:45 .

4 drwxr-xr-x 28 root      root       4096 Feb 16 19:14 ..

0 -rw-r--r-- 1 storageos storageos     0 Mar 20 19:17 FPVMhealthcheck.heartbeat

0 -rw-r--r-- 1 root      root          0 Mar 20 19:17 certtool.lock

0 drwxr-xr-x 2 root      root         19 Mar 20 19:52 hsperfdata_root

0 drwxr-xr-x 2 storageos storageos    32 Mar 20 19:52 hsperfdata_storageos

0 drwx------ 2 root      root         94 Feb 16 19:30 run-crons.6yRf5Z

48 -rwxr-xr-x 1 storageos storageos 48432 Feb 16 19:55 snappy-1.0.5-libsnappyjava.so

0 -rw-r--r-- 1 root      root          0 Mar 20 19:17 systool.lock

ecs0:/tmp # cd run-crons.6yRf5Z/

ecs0:/tmp/run-crons.6yRf5Z # ls -la

total 8898796

drwx------ 2 root root         94 Feb 16 19:30 .

drwxrwxrwt 5 root root        192 Mar 20 19:45 ..

-rw-r--r-- 1 root root 9112358912 Mar 20 19:17 run-crons.hourly.16288

-rw-r--r-- 1 root root         32 Feb 16 19:30 run-crons_mail.16288

-rw-r--r-- 1 root root          0 Feb 16 19:30 run-crons_output.16288

ecs0:/tmp/run-crons.6yRf5Z # file run-crons.hourly.16288

run-crons.hourly.16288: ASCII text

ecs0:/tmp/run-crons.6yRf5Z # head run-crons.hourly.16288

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

ecs0:/tmp/run-crons.6yRf5Z # tail run-crons.hourly.16288

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'timeout --help' for more information.

timeout: invalid time interval 'find'

Try 'tim

ecs0:/tmp/run-crons.6yRf5Z # rm run-crons.hourly.16288

Right or wrong, I removed the file rather than truncating it, but I wonder whether this is a bug to address, expected behavior, or a combination of the two. After exiting the container, I issued a docker restart on it, and all the services began to function normally.
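
On the remove-versus-truncate question: if the process that wrote the file still holds it open, removing the file only unlinks its name, and the space comes back once that process closes the descriptor or exits (a container restart would do that). Truncating in place frees the space immediately. A minimal sketch of the truncation approach follows, run against a scratch file. The helper name `truncate_in_place` is hypothetical.

```shell
#!/bin/sh
# Sketch only: empty a runaway log in place instead of deleting it.
# truncate_in_place is a hypothetical helper name, not part of ECS.
truncate_in_place() {
    # ': > file' reopens the file with O_TRUNC, so its size drops to 0 while
    # the inode (and any writer's open descriptor) stays the same.
    : > "$1"
}

# Demonstration on a scratch file rather than the real cron output.
demo_file=$(mktemp)
printf 'timeout: invalid time interval\n' > "$demo_file"
truncate_in_place "$demo_file"
demo_size=$(wc -c < "$demo_file" | tr -d ' ')
echo "size after truncation: $demo_size bytes"
```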

Thoughts?

Thanks!! Mike

Responses(4)

JasonCwik

March 21st, 2017 08:00

That's an interesting one. How long was your system up? What base OS are you installed on?

Mike_Phelps

March 21st, 2017 12:00

CentOS 7.3.1611 created on 2017 Feb 16

Docker Engine 1.13.1

emccorp/ecs-software-3.0.0:latest image id e3022d56bf25

4 vCPU, 32GB RAM, 80GB OS

4 x 1TB LUNs for ECS

JasonCwik

April 11th, 2017 10:00

Chris, actually the opposite. The setting does exist in real ECS (7200) and our patch inadvertently removed it.

travis_wichert

July 21st, 2017 12:00

Yes, that time in real ECS is two hours (7200s).

Also, this issue should be resolved in ECS CE since we are no longer overwriting vnest.object.properties when building the CE image from the upstream ECS release artifact.
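
The error in the cron output is consistent with a `timeout` command whose duration argument expanded to nothing, which leaves `find` as the first argument. The real ECS cron script is not shown in this thread, so the following is only a sketch of that failure mode and of a defensive default. The variable name `CLEANUP_TIMEOUT` is a hypothetical stand-in for the missing setting.

```shell
#!/bin/sh
# Sketch only: the real ECS cron script is not shown in this thread.
# CLEANUP_TIMEOUT is a hypothetical variable standing in for the missing setting.
CLEANUP_TIMEOUT=""

# Unquoted and empty, the variable vanishes, so timeout sees 'find' as its
# duration argument and fails:
#   timeout $CLEANUP_TIMEOUT find /tmp -maxdepth 0
#   -> timeout: invalid time interval 'find'

# Falling back to a default keeps the command well formed. 7200 seconds is the
# value quoted above for real ECS.
effective_timeout="${CLEANUP_TIMEOUT:-7200}"
echo "effective timeout: ${effective_timeout}s"

# Only run the guarded command where GNU timeout is available.
if command -v timeout >/dev/null 2>&1; then
    timeout "$effective_timeout" find /tmp -maxdepth 0 >/dev/null
fi
```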
