-
Notifications
You must be signed in to change notification settings - Fork 28
Description
Hi
getprocs uses an IndexPointer under AIX, which doesn't need to correlate in every case to the PID.
The usage is correct in FetchPids, where we start with 0 and iterate through the process table:
elastic-agent-system-metrics/metric/system/process/process_aix.go
Lines 46 to 52 in 406673b
| pid := C.pid_t(0) | |
| procMap := make(ProcsMap, 0) | |
| var plist []ProcState | |
| for { | |
| // getprocs first argument is a void* | |
| num, err := C.getprocs(unsafe.Pointer(&info), C.sizeof_struct_procsinfo64, nil, 0, &pid, 1) |
But in GetInfoForPid we are using the IndexPointer as the pid:
elastic-agent-system-metrics/metric/system/process/process_aix.go
Lines 66 to 70 in 406673b
| func GetInfoForPid(_ resolve.Resolver, pid int) (ProcState, error) { | |
| info := C.struct_procsinfo64{} | |
| cpid := C.pid_t(pid) | |
| num, err := C.getprocs(unsafe.Pointer(&info), C.sizeof_struct_procsinfo64, nil, 0, &cpid, 1) |
And in FillPidMetrics the same:
elastic-agent-system-metrics/metric/system/process/process_aix.go
Lines 113 to 118 in 406673b
| func FillPidMetrics(_ resolve.Resolver, pid int, state ProcState, filter func(string) bool) (ProcState, error) { | |
| pagesize := uint64(os.Getpagesize()) | |
| info := C.struct_procsinfo64{} | |
| cpid := C.pid_t(pid) | |
| num, err := C.getprocs(unsafe.Pointer(&info), C.sizeof_struct_procsinfo64, nil, 0, &cpid, 1) |
Now the problem occurs. Since we have the count parameter set to 1 in GetInfoForPid and FillPidMetrics, we have a matching IndexPointer==pi_pid at least in all of my tests with:
#include <procinfo.h>
#include <sys/types.h>
#define MAXPROCS 1
int main() {
int index = 0;
struct procsinfo64 pinfo[MAXPROCS];
for (;;) {
int curindex = index;
int numproc = getprocs(pinfo, sizeof(struct procsinfo64), NULL, 0, &index, MAXPROCS);
if (numproc < 1) {
break;
}
int i;
for (i = 0;i < numproc; i++) {
printf("%-6d->%-6d %-16s\n", curindex, pinfo[i].pi_pid, pinfo[i].pi_comm);
}
}
return 0;
}
/*
Output:
0 ->0
1 ->1 init
260 ->260 wait
65798 ->65798 sched
131336->131336 lrud
196874->196874 vmptacrt
262412->262412 psmd
327950->327950 vmmd
393488->393488 pvlist
459026->459026 reaffin
524564->524564 memgrdd
590112->590112 lvmbb
...
*/Now I'm not sure if IndexPointer==pi_pid can be guaranteed for count=1 and in my opinion, we should assert the requested pid matches the info.pi_pid in the CGo functions.
But also if the process/pid was stopped in the mean time, usually the resulting ProcessBuffer has info.pi_state=Running set, which is wrong.
Therefore: Do we need to ensure that the PID exists before calling of especially GetInfoForPid or should the GetInfoForPid be modified to check first if the process exists by iterating through the getprocs index pointer? I think the second option behaves more similar to the Linux version.
I'm asking this because I'm implementing AIX support for process.GetPIDState.