Problem
The following scenario:
- Start an LXC instance and wait until running.
- Reboot executor machine.
Now the LXC container is stopped, but OpenVDC still thinks it is running.
- Start the instance again.
Result:
Feb 20 14:38:24 ci openvdc-scheduler[2806]: 2017-02-20 14:38:24 [FATAL] github.com/axsh/openvdc/api/instance_service.go:86 BUGON: Detected un-handled state instance_id=i-0000000000 state=state:RUNNING created_at:<seconds:1487314564 nanos:237858284 >
Feb 20 14:38:24 ci systemd[1]: openvdc-scheduler.service: main process exited, code=exited, status=1/FAILURE
Feb 20 14:38:24 ci systemd[1]: Unit openvdc-scheduler.service entered failed state.
Feb 20 14:38:24 ci systemd[1]: openvdc-scheduler.service failed.
The openvdc-scheduler service dies.
# systemctl status openvdc-scheduler
● openvdc-scheduler.service - OpenVDC scheduler
Loaded: loaded (/usr/lib/systemd/system/openvdc-scheduler.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2017-02-20 14:38:24 JST; 6min ago
Process: 2806 ExecStart=/opt/axsh/openvdc/bin/openvdc-scheduler (code=exited, status=1/FAILURE)
Main PID: 2806 (code=exited, status=1/FAILURE)
Suggested solution
- On executor start, OpenVDC should check that all instances are in their expected state. If they are not, they should be brought to the state OpenVDC expects them to be in.
- When start is called on a container OpenVDC thinks is "RUNNING", first check which state the instance is actually in. Then switch it to the correct state and run the start command on that.
- Make sure that the scheduler never dies, no matter what state start is called on.
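To make the three suggestions a bit more concrete, here is a rough sketch in Go. It is not based on the actual OpenVDC code: `State`, `Store`, `Driver`, `Reconcile`, `StartInstance` and `fakeDriver` are all hypothetical names made up for illustration, and the real state model has more states than the two used here. The idea is only that the recorded state gets compared with the actual one, and that an unexpected state becomes a returned error rather than a fatal exit.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a simplified, hypothetical stand-in for OpenVDC's instance state.
type State int

const (
	Stopped State = iota
	Running
)

// Driver is a hypothetical interface to the real container (e.g. LXC).
type Driver interface {
	ActualState(id string) (State, error)
	Start(id string) error
}

// Store is a hypothetical stand-in for wherever the recorded state lives.
type Store map[string]State

// Reconcile is meant to run on executor start (suggestion 1). It restarts
// instances that are recorded as Running but are actually Stopped, and
// collects errors instead of aborting on the first one.
func Reconcile(store Store, d Driver) []error {
	var errs []error
	for id, want := range store {
		got, err := d.ActualState(id)
		if err != nil {
			errs = append(errs, fmt.Errorf("query %s: %w", id, err))
			continue
		}
		if want == Running && got == Stopped {
			if err := d.Start(id); err != nil {
				errs = append(errs, fmt.Errorf("restart %s: %w", id, err))
			}
		}
	}
	return errs
}

// StartInstance handles a start request (suggestions 2 and 3). It checks the
// actual state first, corrects a stale record, and reports problems as
// errors so the calling service keeps running.
func StartInstance(store Store, d Driver, id string) error {
	recorded, ok := store[id]
	if !ok {
		return fmt.Errorf("unknown instance %s", id)
	}
	actual, err := d.ActualState(id)
	if err != nil {
		return fmt.Errorf("query %s: %w", id, err)
	}
	if recorded != actual {
		store[id] = actual // fix the stale record before acting on it
	}
	if actual == Running {
		return nil // already running, nothing to do
	}
	if err := d.Start(id); err != nil {
		return fmt.Errorf("start %s: %w", id, err)
	}
	store[id] = Running
	return nil
}

// fakeDriver is an in-memory Driver used only to exercise the sketch.
type fakeDriver map[string]State

func (f fakeDriver) ActualState(id string) (State, error) {
	s, ok := f[id]
	if !ok {
		return Stopped, errors.New("no such container")
	}
	return s, nil
}

func (f fakeDriver) Start(id string) error {
	if _, ok := f[id]; !ok {
		return errors.New("no such container")
	}
	f[id] = Running
	return nil
}
```

With a store that records i-0000000000 as Running and a fakeDriver that reports it as Stopped (the situation after the reboot above), calling StartInstance would start the container and return nil. Whether an already running instance should be a silent no-op or an error to the caller is a design choice left open here.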
Other suggestions welcome. ^_^