A list for people interested in developing the OVAL language.
Hi!
Recently we, OpenSCAP and Compliance as Code, tried to facilitate some
performance optimizations for rules that traverse the whole filesystem
of the target.
In general these rules look for files with certain characteristics
rather than collecting info about a group in a known location, hence
the full scan. A typical example is a list of all suid-enabled files.
That usually means the object is described with
<filepath>/</filepath> and <behaviors recurse_file_system="local"
recurse_direction="down"/>.
The recurse_file_system="local" value was designed to help avoid
unnecessary performance problems when running queries of that
kind. But it is not really clear how far the scanner should
take this optimization.
For example, OpenSCAP treats these filesystem types as non-local:
"ceph", "cifs", "smb3", "smbfs", "sshfs", "ncpfs", "ncp", "nfs",
"nfs4", "gfs", "gfs2", "glusterfs", "gpfs", "pvfs2", "ocfs2",
"lustre", "davfs", "afs". They all are network-mounted filesystems and
there are no questions about their remoteness or performance impact.
(Although, they still can bear threats to the system, so it might be
questionable if we should blindly exclude them from the scan. But
that's another topic).
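For illustration, type-based filtering of this kind can be sketched in a few lines of Python. This is only a sketch, not OpenSCAP's actual implementation: the type list is the one quoted above, while parse_mounts, local_mount_points and the sample mount table are made-up names and data, and the input is assumed to be in /proc/mounts format.

```python
# Filesystem types treated as non-local, copied from the list above.
NON_LOCAL_FS_TYPES = frozenset({
    "ceph", "cifs", "smb3", "smbfs", "sshfs", "ncpfs", "ncp", "nfs",
    "nfs4", "gfs", "gfs2", "glusterfs", "gpfs", "pvfs2", "ocfs2",
    "lustre", "davfs", "afs",
})


def parse_mounts(text):
    """Parse /proc/mounts-style lines into (mount_point, fs_type) pairs."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Field order: device, mount point, type, options, ...
            entries.append((fields[1], fields[2]))
    return entries


def local_mount_points(text):
    """Return mount points whose type is not on the non-local list."""
    return [mp for mp, fs_type in parse_mounts(text)
            if fs_type not in NON_LOCAL_FS_TYPES]


# Hypothetical mount table, for demonstration only.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
server:/export /mnt/nfs nfs4 rw,relatime 0 0
proc /proc proc rw,nosuid 0 0
"""

print(local_mount_points(SAMPLE))  # prints ['/', '/proc']
```

Note that /proc still passes this filter, since only the filesystem type is compared against the list.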
Then there are "proc" and "sysfs" filesystems, which are not really
remote, but at the same time not really filesystems. They are
projections of internal databases (sometimes very big ones), and not
really meant to be recursively traversed. Are we allowed to exclude
them as well?
Another interesting filesystem type is the OCI (Docker/Podman)
"overlay" (/var/lib/containers/storage/overlay/), which is not really
local either, though in a different sense: it is mounted from
virtualized environments. And lots of containers might have a huge
impact on the performance of the host scan.
So, my question is, how exactly do we define "local" and "non-local"
filesystems?
Best regards,
Evgenii Kolesnikov
Senior Software Engineer
Security Compliance
Red Hat EMEA
https://www.redhat.com
Hi Evgeny,
This is an underspecified part of the specification, for which we’ve tried a few different approaches. Part of the problem is that filesystem types can be (more or less) arbitrary strings, and there is no way to conclusively enumerate them.
Our original approach to solving this problem was to say that whatever the filesystem type was for ‘/’ was the “local” type. So for local recursion, we’d filter using that type. More recently, we’ve come to use the -l switch in the df command to identify the “local” filesystems. We feel this is probably a better reflection of the intention behind the local attribute value. Either of the above approaches will exclude special filesystems like /proc.
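The first approach can be sketched as follows. This is an illustration under assumptions, not the scanner's real code: mounts are assumed to be available as (mount point, type) pairs, and root_fs_type, local_by_root_type and the sample table are hypothetical.

```python
def root_fs_type(mounts):
    """Return the type of the filesystem mounted on '/', or None."""
    fs_type = None
    for mount_point, mount_type in mounts:
        if mount_point == "/":
            # A later mount on '/' shadows an earlier one.
            fs_type = mount_type
    return fs_type


def local_by_root_type(mounts):
    """Keep only mount points whose type matches the type of '/'."""
    root_type = root_fs_type(mounts)
    return [mp for mp, mount_type in mounts if mount_type == root_type]


# Hypothetical mount table, for demonstration only.
MOUNTS = [
    ("/", "xfs"),
    ("/home", "xfs"),
    ("/boot", "ext4"),
    ("/proc", "proc"),
    ("/mnt/share", "nfs4"),
]

print(local_by_root_type(MOUNTS))  # prints ['/', '/home']
```

In this sample /boot is filtered out along with /proc and the NFS share, because its type differs from that of /, which shows how strict the heuristic is.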
You have a good point about the overlay filesystem; I do not know offhand whether that filesystem reliably has a particular type (is it always “overlay”?). I am fairly certain, however, that the local attribute pre-dates the birth of Docker, and was intended only to prevent searches of NFS-mounted drives. We should come up with a more conclusive determination of its meaning today.
Best regards,
—David Solin
On Apr 9, 2021, at 12:05 AM, Evgeny Kolesnikov ekolesni@redhat.com wrote:
> [...]
Hey David!
> Our original approach to solving this problem was to say that whatever the filesystem type was for ‘/’ was the “local” type. So for local recursion, we’d filter using that type.
I thought that this scenario is covered by the 'defined' value. Should
it also be a part of the 'local' behaviour?
> More recently, we’ve come to use the -l switch in the df command to identify the “local” filesystems. We feel this is probably a better reflection of the intention behind the local attribute value. Either of the above approaches will exclude special filesystems like /proc.
The coreutils' df tool has two types of filesystems it filters out:
dummy[1] and remote[2]. Both 'proc' and 'sysfs' are dummies, and they
are displayed by the df tool only when -a (all) option is present. So,
I guess it would be correct for OpenSCAP to also support a list of
dummy filesystems to skip in 'local' mode.
Still, I think it would be better to at least mention this approach in
the spec, despite it not being a formal list of filesystems.
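A sketch of how such a dummy list could sit next to a remote list when deciding what to traverse in 'local' mode. The sets here are deliberately abbreviated and illustrative (only "proc" and "sysfs" come from this thread as dummies; see [1] and [2] for what df really checks), and FsClass, classify and traverse_in_local_mode are hypothetical names, not OpenSCAP or gnulib APIs.

```python
from enum import Enum


class FsClass(Enum):
    LOCAL = "local"
    DUMMY = "dummy"
    REMOTE = "remote"


# Illustrative, incomplete sets; a real scanner would need fuller lists.
DUMMY_FS_TYPES = frozenset({"proc", "sysfs"})
REMOTE_FS_TYPES = frozenset({"nfs", "nfs4", "cifs"})


def classify(fs_type):
    """Classify a filesystem type as dummy, remote or local."""
    if fs_type in DUMMY_FS_TYPES:
        return FsClass.DUMMY
    if fs_type in REMOTE_FS_TYPES:
        return FsClass.REMOTE
    return FsClass.LOCAL


def traverse_in_local_mode(fs_type):
    """In 'local' mode, descend only into filesystems classified as local."""
    return classify(fs_type) is FsClass.LOCAL


for fs_type in ("ext4", "proc", "nfs4", "overlay"):
    print(fs_type, classify(fs_type).value, traverse_in_local_mode(fs_type))
```

Any type absent from both sets, "overlay" included, counts as local in this sketch.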
> You have a good point about the overlay filesystem; I do not know offhand whether that filesystem reliably has a particular type (is it always “overlay”?). I am fairly certain, however, that the local attribute pre-dates the birth of Docker, and was intended only to prevent searches of NFS-mounted drives. We should come up with a more conclusive determination of its meaning today.
Unfortunately it is a regular "overlay" type, and moreover the df tool
thinks that it is a local filesystem:
Filesystem 1K-blocks     Used Available Use% Mounted on
...
shm            64000        0     64000   0% /var/lib/containers/storage/overlay-containers/b992d5a015e84d8f336c1cf19dc080ea1c1ba0dc60c538c3774c253032e31bb2/userdata/shm
overlay     71711864 39340860  28685220  58% /var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/merged
...
overlay on /var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/merged type overlay (rw,nodev,relatime,context="system_u:object_r:container_file_t:s0:c320,c894",lowerdir=/var/lib/containers/storage/overlay/l/QJBWQN7FISPOYVK2S5PWYANC2U,upperdir=/var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/diff,workdir=/var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/work)
No idea how to formally differentiate it from other overlays at this moment.
Best regards,
Evgenii Kolesnikov
Red Hat EMEA.
[1] https://github.com/coreutils/gnulib/blob/2a597a7ae16fb988daa473f8df2468d7188c118a/lib/mountlist.c#L164
[2] https://github.com/coreutils/gnulib/blob/2a597a7ae16fb988daa473f8df2468d7188c118a/lib/mountlist.c#L223
Hi Evgeny, see below...
On Apr 12, 2021, at 2:42 AM, Evgeny Kolesnikov ekolesni@redhat.com wrote:
> Hey David!
>> Our original approach to solving this problem was to say that whatever the filesystem type was for ‘/’ was the “local” type. So for local recursion, we’d filter using that type.
> I thought that this scenario is covered by the 'defined' value. Should
> it also be a part of the 'local' behaviour?
‘defined’ is a little different. What I described above operates on the filesystem type and therefore permits traversal between filesystems as long as they’re the same type. What ‘defined’ means (from the schema documentation) follows:
“The value of 'defined' keeps any recursion within the file system that the file_object (path+filename or filepath) has specified.”
So, if you had ‘defined’ set, recursion should not cross from one filesystem into another; e.g., you wouldn’t be able to find a file in /tmp if you started from / and /tmp were a separate mount.
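The 'defined' behaviour resembles what find does with -xdev. A Python sketch, assuming device IDs (st_dev) are a good enough way to tell filesystems apart; walk_within_filesystem and its device_of parameter are hypothetical names introduced for illustration, not part of any OVAL interpreter:

```python
import os


def walk_within_filesystem(start, device_of=lambda p: os.lstat(p).st_dev):
    """Yield file paths under start without crossing onto another filesystem.

    device_of maps a path to a device identifier; it is injectable so the
    pruning logic can be exercised without real mount points.
    """
    start_dev = device_of(start)
    for dirpath, dirnames, filenames in os.walk(start):
        # Prune in place: os.walk will not descend into removed entries.
        dirnames[:] = [
            d for d in dirnames
            if device_of(os.path.join(dirpath, d)) == start_dev
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Calling walk_within_filesystem("/") on a host where /tmp is its own mount would skip everything under /tmp, whereas the type-based filter described earlier would still descend into it if the types matched.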
>> More recently, we’ve come to use the -l switch in the df command to identify the “local” filesystems. We feel this is probably a better reflection of the intention behind the local attribute value. Either of the above approaches will exclude special filesystems like /proc.
> The coreutils' df tool has two types of filesystems it filters out:
> dummy[1] and remote[2]. Both 'proc' and 'sysfs' are dummies, and they
> are displayed by the df tool only when -a (all) option is present. So,
> I guess it would be correct for OpenSCAP to also support a list of
> dummy filesystems to skip in 'local' mode.
> Still, I think it would be better to at least mention this approach in
> the spec, despite it not being a formal list of filesystems.
I quite agree!
>> You have a good point about the overlay filesystem; I do not know offhand whether that filesystem reliably has a particular type (is it always “overlay”?). I am fairly certain, however, that the local attribute pre-dates the birth of Docker, and was intended only to prevent searches of NFS-mounted drives. We should come up with a more conclusive determination of its meaning today.
> Unfortunately it is a regular "overlay" type, and moreover the df tool
> thinks that it is a local filesystem:
> [...]
> No idea how to formally differentiate it from other overlays at this moment.
We could introduce a new recurse behavior to control traversal into overlay filesystems — but, what other overlays are there?