
JonathanAnbary
Contributor

Implemented process scanning in the library (not yet accessible from the CLI) for Linux and Windows.
The semantics of the scan are as follows:

  • Strings are searched in each memory region of the scanned process separately, and the results are combined.
  • Modules are not run, and conditions that depend on offsets evaluate to undefined (behaving as though the length of the scanned data is 0).
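The per-region semantics described above can be illustrated with a minimal sketch. This is not the PR's code: `Region` and `scan_regions` are hypothetical names, and a plain byte-pattern search stands in for the real string matching. It only shows that each region is searched on its own and the hits are then merged, so a pattern split across two regions is not reported.

```rust
/// A contiguous region of a process's address space.
/// Hypothetical type for illustration; not taken from the PR.
struct Region {
    base: u64,
    data: Vec<u8>,
}

/// Search every region independently and return the absolute addresses
/// of all matches, combined in region order.
fn scan_regions(regions: &[Region], pattern: &[u8]) -> Vec<u64> {
    let mut hits = Vec::new();
    if pattern.is_empty() {
        // `windows(0)` would panic, and an empty pattern is meaningless here.
        return hits;
    }
    for region in regions {
        // `windows` never reads past the end of `region.data`, so a pattern
        // that straddles two regions is never matched.
        for (offset, window) in region.data.windows(pattern.len()).enumerate() {
            if window == pattern {
                hits.push(region.base + offset as u64);
            }
        }
    }
    hits
}
```

Reporting absolute addresses (region base plus offset) keeps the combined result unambiguous even when the regions are far apart in the address space.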

@JonathanAnbary JonathanAnbary changed the title from "Process scanning" to "feat: process scanning" Jun 24, 2025
@JonathanAnbary
Contributor Author

I fixed the clippy errors (sorry about that).
Regarding the macOS issues, I think I fixed those as well (meaning that it should now compile, not that I implemented process scanning), but I don't have a way to test it. I'm working from Linux, and cross-compiling to macOS seems like a pain, unless someone has a decent flake.nix/shell.nix for cargo with cross-compilation to Darwin.

@JonathanAnbary
Contributor Author

OK, I'm pretty sure I actually fixed the clippy errors now (I switched to the toolchain that's listed at the top of the clippy job).
Regarding macOS, I also fixed the error that was reported in the job, but I don't actually have a way to test it.
I'll try to set up cross-compilation for macOS from my Linux machine, and if that works I'll at least be able to ensure that it compiles.


@secDre4mer secDre4mer left a comment

Thank you for implementing this (as this was the main thing I was missing from yara-x so far).

Regarding macOS: if you need something tested on macOS, I'd be glad to help out.

.set(self.wasm_store.as_context_mut(), Val::I32(1))
.unwrap();

// Set the global variable `filesize` to the total size of the memory regions.

Possibly this should be something the caller can choose? YARA uses "memory iterators", which have a similar option.

Contributor Author

That sounds good. The only thing I'm not sure about is how to set a wasm global to undefined.
Adding a method to the DataIter trait is probably the way to do this.
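As a rough sketch of the "method on the trait" idea: the trait below is a hypothetical stand-in, since the PR's actual `DataIter` definition is not shown in this thread, and `next_block`, `filesize`, and `Regions` are invented names. It only illustrates how the caller's choice could be surfaced as an `Option`; how an undefined value would then be written into the wasm global is left open, as in the comment above.

```rust
/// Hypothetical stand-in for the PR's `DataIter` trait.
/// The real trait's methods are not shown in this thread.
trait DataIter {
    /// Next (base address, bytes) block, or `None` when exhausted.
    fn next_block(&mut self) -> Option<(u64, Vec<u8>)>;

    /// Value the scanner should expose as `filesize`.
    /// `None` means "undefined". The default keeps existing
    /// implementors source-compatible.
    fn filesize(&self) -> Option<u64> {
        None
    }
}

/// Example implementor where the caller picks the behaviour up front.
struct Regions {
    blocks: std::vec::IntoIter<(u64, Vec<u8>)>,
    filesize: Option<u64>,
}

impl Regions {
    fn new(blocks: Vec<(u64, Vec<u8>)>, report_total: bool) -> Self {
        // Sum the sizes once, before any block is consumed.
        let total: u64 = blocks.iter().map(|(_, d)| d.len() as u64).sum();
        Regions {
            blocks: blocks.into_iter(),
            filesize: report_total.then_some(total),
        }
    }
}

impl DataIter for Regions {
    fn next_block(&mut self) -> Option<(u64, Vec<u8>)> {
        self.blocks.next()
    }

    fn filesize(&self) -> Option<u64> {
        self.filesize
    }
}
```

Computing the total at construction time means the reported value does not change as blocks are consumed during the scan.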

@JonathanAnbary
Contributor Author

Regarding macOS: if you need something tested on macOS, I'd be glad to help out.

If you could try compiling the branch for macOS and let me know about any issues that arise, that would be great (process scanning is not implemented for macOS, so don't expect to be able to scan processes; I just want to make sure that it compiles).

@secDre4mer

It compiles on MacOS. There are some warnings about unused code (warning: method scan_many_impl is never used and similar), but that is expected since without process scanning, these are superfluous for MacOS right now.

@JonathanAnbary
Contributor Author

@plusvic would it be possible to run the checks now?
I'm pretty sure it should pass now.

@JonathanAnbary
Contributor Author

I am happy to report that my hat is delicious.
I fixed the test, and then compiled and tested with windows-msvc (I was previously using windows-gnu).
It's a bit of a weird test, and it honestly might be better to just remove it, but I wanted to make sure that the match content was not being duplicated for overlapping matches, and this test (scan_proc_overlapping_matches) was the only way I could think of to do that.
I'd be happy if you could rerun the tests, hopefully for the last time...

… processes memory into our own memory at once
…ux (although of course, scan_proc wont be available)
@JonathanAnbary
Contributor Author

Hi @plusvic.
I'd love to know what needs to be done in order to get this merged as soon as possible :)

@plusvic
Member

plusvic commented Jul 7, 2025

Hi @JonathanAnbary, it will take me some time before I can look into this thoroughly. As this is a big feature with a lot of implications from the API standpoint, I want to be very careful with it. My preliminary assessment is that this is unlikely to be merged as is. I would like to avoid, at all costs, having all the OS-dependent code merged into the yara-x crate. I will probably think about a generic API that allows implementing process scanning on top of it, while leaving the memory-reading part to the user of the API.

@gustavo-iniguez-goya

Aupa @plusvic !

If you eventually add this feature, consider parsing /proc/<pid>/exe and other files (stat, status, ...), in order to scan for red flags on Linux, like processes masquerading as kernel threads, or processes executed from locations like /dev/shm, /tmp, /memfd, /var/tmp, etc.
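A small sketch of what such checks might look like, under stated assumptions: the function names are hypothetical, the directory list is just the one from the comment above, the "/memfd:" prefix is assumed to be how memfd-backed executables appear in the `exe` link target, and the kernel-thread heuristic (bracketed name but a resolvable `exe` link) is one possible reading of "masquerading", not an established rule.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Directories listed in the comment above as suspicious launch locations.
const SUSPICIOUS_DIRS: [&str; 3] = ["/dev/shm", "/tmp", "/var/tmp"];

/// Resolve the `/proc/<pid>/exe` symlink (Linux only at run time).
fn exe_target(pid: u32) -> io::Result<PathBuf> {
    std::fs::read_link(format!("/proc/{pid}/exe"))
}

/// True if an `exe` link target points into a suspicious location.
fn is_suspicious_exe(target: &str) -> bool {
    // Assumption: memfd-backed executables show up with a "/memfd:" prefix.
    if target.starts_with("/memfd:") {
        return true;
    }
    // `Path::starts_with` compares whole components, so "/tmpfiles/x"
    // is not treated as being under "/tmp".
    let path = Path::new(target);
    SUSPICIOUS_DIRS.iter().any(|dir| path.starts_with(dir))
}

/// Heuristic: a name styled like a kernel thread ("[kworker/0:1]")
/// combined with a resolvable `exe` link suggests a user-space process
/// posing as a kernel thread.
fn masquerades_as_kernel_thread(argv0: &str, has_exe: bool) -> bool {
    argv0.starts_with('[') && argv0.ends_with(']') && has_exe
}
```

For a live process, the result of `exe_target(pid)` could be converted with `to_string_lossy()` and passed to `is_suspicious_exe`; permission errors from `read_link` would need to be handled by the caller.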

@plusvic
Member

plusvic commented Sep 26, 2025

After merging #459 we already have an API for scanning memory blocks that could support process memory scanning on top of it. For the time being I don't plan to implement the OS-dependent memory reading logic. If that's ever implemented, I envision it as a separate crate that abstracts you from the details of reading the memory of another process given the process ID. I guess some other Rust crates can benefit from having such an API in a stand-alone crate.

With the new block scanning API, and some other crate that reads memory from external processes, implementing process scanning should be fairly trivial.
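To make the division of labour concrete, here is a hedged sketch of the memory-enumeration side on Linux. It parses lines in the `/proc/<pid>/maps` format and hands each readable mapping to a callback. Everything here is illustrative: `Mapping`, `parse_maps_line`, and `for_each_block` are hypothetical names, the `read` closure stands in for whatever crate reads another process's memory, and `scan_block` stands in for the block-scanning entry point, whose actual names are not shown in this thread.

```rust
/// One mapping from `/proc/<pid>/maps` (only the fields used here).
#[derive(Debug, PartialEq)]
struct Mapping {
    start: u64,
    end: u64,
    readable: bool,
}

/// Parse a line such as
/// "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat".
/// Returns `None` for lines that do not look like a mapping.
fn parse_maps_line(line: &str) -> Option<Mapping> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?;
    let (start, end) = range.split_once('-')?;
    Some(Mapping {
        start: u64::from_str_radix(start, 16).ok()?,
        end: u64::from_str_radix(end, 16).ok()?,
        readable: perms.starts_with('r'),
    })
}

/// Feed every readable mapping to `scan_block` as (base address, bytes).
/// `read(base, len)` is a placeholder for the OS-specific memory reader;
/// mappings it cannot read are skipped.
fn for_each_block<R, S>(maps: &str, mut read: R, mut scan_block: S)
where
    R: FnMut(u64, usize) -> Option<Vec<u8>>,
    S: FnMut(u64, &[u8]),
{
    for m in maps.lines().filter_map(parse_maps_line).filter(|m| m.readable) {
        let len = m.end.saturating_sub(m.start) as usize;
        if let Some(data) = read(m.start, len) {
            scan_block(m.start, &data);
        }
    }
}
```

Keeping the reader behind a closure mirrors the separation described above: the scanner only sees blocks with base addresses, and the OS-dependent part can live in a separate crate.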
