Cortexutils extractor ip detection #199
base: develop
| As the extractor should extract observables that aren't available as "a single line" also, line start and end markers ( I need to think about how (if?) it's possible to distinguish between an IP in CIDR notation which is in-line and the mentioned "domain case". | 
| Even with the fix,  Why do you have to use regular expressions to match IP addresses? Why not use something like ip_address from the standard library?

    from ipaddress import ip_address

    def is_ip(s):
        # ip_address() raises ValueError for anything that is not a valid IPv4 or IPv6 address
        try:
            ip_address(s)
            return True
        except ValueError:
            return False

In the end, there are no different types for IPv4 and IPv6 in TheHive, only  Also, maybe it would be worth considering adding a  | 
| 
 Because the extractor should provide an easy way to retrieve observables from reports even if they are in-line and not explicitly given. Analyzers do not have to use the extractor and can implement an own  
 That would indeed be possible - after finding a possible IP address using regex. | 
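The two-step approach discussed here (regex to find candidates in free text, then `ip_address` to validate them) could look roughly like the sketch below. The pattern and the name `extract_ipv4` are hypothetical and are not the regex used in cortexutils; the sketch only covers IPv4.

```python
import re
from ipaddress import ip_address

# Hypothetical, deliberately loose candidate pattern (not the cortexutils regex):
# four dot-separated groups of 1-3 digits, not glued to a word character or to
# another dotted number on either side.
CANDIDATE_IPV4 = re.compile(r"(?<![\w.])\d{1,3}(?:\.\d{1,3}){3}(?!\w|\.\d)")


def extract_ipv4(text):
    """Return the IPv4 addresses found in free text, in order of appearance."""
    found = []
    for match in CANDIDATE_IPV4.finditer(text):
        candidate = match.group(0)
        try:
            # The standard library does the range checking (e.g. rejects 999.1.1.1).
            ip_address(candidate)
        except ValueError:
            continue
        found.append(candidate)
    return found


print(extract_ipv4("Beacon to 10.0.0.5 and 999.1.1.1, agent v1.2.3.4.5, host 8.8.8.8."))
```

The regex stays simple because it no longer has to encode the valid octet ranges; that part is delegated to `ip_address`.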
| 
 
 I'm sorry, I don't understand why this makes using regular expressions a hard requirement. 
 
 Again, why?  (*) The changes would be minimal and only internal to the class. EDIT: to be clear, I volunteer to implement such changes in a PR if you deem them worthwhile. | 
| 
 Yes, it takes a string, but cannot find addresses in a block of text. Again, the extractor should provide the functionality to "automagically" find observables/IoCs in strings which are not the observable itself, but a "wall of text". Of course, recognizing single IP strings through regex is not the best way to achieve that. But you cannot guarantee that. As this only affects analyzers run through MISP, it has low priority for me until Cortex 2 is released and the documentation is polished, since you can easily delete inappropriate IoCs in the overview after running the analyzer (which has to be done anyway, as not every returned value is appropriate). | 
Currently not applicable this way as observables cannot be found in free-text. Without the start and end markers it changes basically nothing. Would like to postpone this until Cortex 2 is released and documentation work is done. But thanks for the contribution.
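To illustrate the trade-off around start and end markers, here is a small sketch that compares an anchored and an unanchored variant of the same pattern. The pattern and the helper `classify` are assumptions made for this example and do not reproduce the cortexutils regex.

```python
import re

# Hypothetical IPv4-shaped pattern, used only for this comparison.
IPV4_BODY = r"\d{1,3}(?:\.\d{1,3}){3}"
ANCHORED = re.compile(r"^" + IPV4_BODY + r"$")  # with start and end markers
UNANCHORED = re.compile(IPV4_BODY)              # without markers


def classify(text):
    """Show what each variant reports for the same input string."""
    return {
        "anchored": bool(ANCHORED.match(text)),
        "unanchored": UNANCHORED.findall(text),
    }


for sample in ("8.8.8.8", "resolved to 8.8.8.8 via DNS", "plugin v2.10.3.1"):
    print(sample, classify(sample))
```

The anchored variant only accepts a string that is the address itself, so it finds nothing in free text. The unanchored variant finds in-line addresses but also picks up the version string in the last sample.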
| I'm running into an issue where the extracted IPs (IPv4) include version strings and are not valid. I fixed this in code that calls Cortex (an external app) by filtering the artifacts through a call to ipaddress. You could post-process the regex results with ipaddress.ip_address(val).is_global to filter out some of the noise. | 
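A minimal sketch of that post-processing step, assuming the extractor results arrive as a list of strings; the function name `keep_global_ips` is made up for this example.

```python
from ipaddress import ip_address


def keep_global_ips(values):
    """Keep only the strings that parse as IP addresses and are globally routable."""
    kept = []
    for value in values:
        try:
            address = ip_address(value)
        except ValueError:
            # Not a syntactically valid address at all.
            continue
        if address.is_global:
            kept.append(value)
    return kept


print(keep_global_ips(["8.8.8.8", "10.0.0.1", "127.0.0.1", "192.168.1.1", "not-an-ip"]))
```

This removes private, loopback, and malformed values. It only reduces the noise: a version string such as 2.10.3.1 is still a valid, globally routable address and would pass the filter.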
fb8f5aa to 23be632
Fix proposal for #198.