The /proc/net files are not guaranteed to be consistent, they are only
consitent on the row level. This is probably one of the reasons why
consequent read calls might return duplicate entries - the kernel is
changing the file as it is being read. In certain situations this might
lead to loop like situations - the same net entry is being returned when
reading the file as new connections are added to the kernel tcp table, i.e
there can be a lot of duplications.
This commit is trying to reduce the duplications, by fetching the contents
of the net files with a single read syscall.
Instead of encoding a JSON string of each connection (non-trivial at high
connection volumes) we can use the connTmp struct for map look-ups if we
eliminate the unused `uids` field.
Also switches to using the empty struct instead of bool for zero memory
overhead.
The goal is to improve performance of connection fetching connections across
all processes when some processes can have several hundred or thousands of file
descriptors. Right now when you have many thousands of fds the process spends
lots of time inside the syscalls from Readdir and Readlink.
The public API works as before with two new functions:
- `ConnectionsMax`
- `ConnectionsPidMax`
Each function takes an additional int argument that sets the max number of fds
read per process.
Package common wasn't used for public functions. Place it in an
internal directory to prevent other packages from using.
Remove the distributed references to "HOST_PROC" and "HOST_SYS"
consts and combine into a common function. This also helps so that
if a env var is defined with a trailing slash all will continue to
work as expected.
Fixes #100
Added the ability to fetch an alternative location for /proc via an
environment variable. If the env var is not set it will return /proc as
the default value.