Description
My Setup
I'm using physical machines managed by a FOG server to re-image VMs after each malware analysis. After an analysis completes, CAPE calls the stop() method in physical.py to reset the machine. This method checks the VM state; if it's running, it triggers a deployment task via the FOG server to restore the VM to a clean snapshot.
Here’s the relevant part of the code:
def stop(self, label):
    """Stop a physical machine.
    @param label: physical machine name.
    @raise CuckooMachineError: if unable to stop.
    """
    taskID_Deploy = 0
    hostID = 0
    ## IF THE AGENT HAS CRASHED, THIS CONDITION WON'T BE TRIGGERED
    ## AND THE VM WON'T BE RE-IMAGED
    if self._status(label) == self.RUNNING:
        log.debug("Rebooting machine: %s", label)
        machine = self._get_machine(label)
        r_hosts = requests.get(f"http://{self.options.fog.hostname}/fog/host", headers=headers)
        hosts = r_hosts.json()["hosts"]
        for host in hosts:
            if machine.name == host["name"]:
                print(f"{host['id']}: {host['name']}")
                hostID = host["id"]
        r_types = requests.get(f"http://{self.options.fog.hostname}/fog/tasktype", headers=headers)
        types = r_types.json()
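For reference, the host lookup in the loop above could be pulled into a small helper, which also makes the "no matching host" case explicit. This is only a sketch: `find_host_id` is a hypothetical name, and the payload shape (a list of dicts with `id` and `name` keys) is assumed from how the snippet reads `r_hosts.json()["hosts"]`.

```python
from typing import Any, Dict, List, Optional


def find_host_id(hosts: List[Dict[str, Any]], machine_name: str) -> Optional[Any]:
    """Return the ID of the FOG host named machine_name, or None if absent."""
    for host in hosts:
        if host.get("name") == machine_name:
            return host.get("id")
    return None


# Made-up payload, shaped like the "hosts" list the snippet iterates over.
sample_hosts = [
    {"id": "3", "name": "sandbox-01"},
    {"id": "7", "name": "sandbox-02"},
]
```

Returning None instead of leaving hostID at 0 would let the caller tell "host not found" apart from a real ID.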
Current Behavior
When the agent inside the VM crashes, self._status(label) does not return RUNNING. As a result, the VM is skipped and never re-imaged, leaving it in an infected state indefinitely.
# IF THE AGENT CRASHES, THIS CONDITION IS NEVER TRIGGERED,
# AND THE VM WILL NOT BE RE-IMAGED
if self._status(label) == self.RUNNING:
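To make the failure mode concrete, here is a toy reproduction. It is not CAPE's real class: `FakePhysicalMachinery` and the machine labels are made up, and it assumes the status is derived purely from whether the agent responds.

```python
RUNNING = "running"
ERROR = "error"


class FakePhysicalMachinery:
    """Toy stand-in for the machinery module; not CAPE's real class."""

    def __init__(self, agent_alive):
        # Maps machine label -> whether its agent still responds.
        self.agent_alive = agent_alive
        self.reimaged = []

    def _status(self, label):
        # Assumption: the status depends only on the agent responding.
        return RUNNING if self.agent_alive.get(label, False) else ERROR

    def stop(self, label):
        # Same guard as in the issue: re-image only when status is RUNNING.
        if self._status(label) == RUNNING:
            self.reimaged.append(label)


machinery = FakePhysicalMachinery({"sandbox-01": True, "sandbox-02": False})
machinery.stop("sandbox-01")
machinery.stop("sandbox-02")
print(machinery.reimaged)  # ['sandbox-01']
```

The machine whose agent is down ("sandbox-02") is never queued for re-imaging, which matches what I see on the real setup.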
Fix Attempt 1
To work around this, I modified the condition to check whether self._get_machine(label) returns a machine object instead of relying on self._status(label):
machine = self._get_machine(label)
# if self._status(label) == self.RUNNING:
if machine:
    log.debug("Rebooting machine: %s", label)
    # machine = self._get_machine(label)
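The same decision can be written as a standalone function, which keeps the status around for logging without letting it block the re-image. This is a sketch only: `should_reimage` is a hypothetical helper, not part of CAPE, and the "running" status string is a placeholder.

```python
import logging

log = logging.getLogger(__name__)

RUNNING = "running"  # placeholder for the machinery's RUNNING constant


def should_reimage(machine, status):
    """Decide whether to queue a re-image for a machine.

    machine: the object returned by a lookup such as _get_machine(), or None.
    status:  whatever the status check reported; used for logging only.
    """
    if machine is None:
        return False
    if status != RUNNING:
        # The agent may have crashed; re-image anyway so the machine is clean.
        log.warning("Status is %r, re-imaging anyway", status)
    return True
```

Checking `machine is None` rather than plain truthiness avoids surprises if the machine object ever evaluates as falsy.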
New Problem Introduced
While this workaround does trigger re-imaging even when the agent crashes, it appears to cause another issue: machines whose agent crashed are no longer used in subsequent analyses. I suspect this is because they are marked as inactive or removed from the machine pool in the SQLAlchemy-backed database.