oval_developer@lists.cisecurity.org

A list for people interested in developing the OVAL language.

unix-def:FileBehaviors and recurse_file_system=local

Evgeny Kolesnikov
Fri, Apr 9, 2021 5:05 AM

Hi!

Recently we, OpenSCAP and Compliance as Code, tried to facilitate some
performance optimizations for rules that traverse the whole filesystem
of the target.

In general these rules are about looking for files of certain
characteristics rather than collecting info about a group in a known
location, hence the full scan. A typical example is a list of all
suid-enabled files. That usually means the object is described with
<filepath>/</filepath> and <behaviors recurse_file_system="local" recurse_direction="down">.

The recurse_file_system=local behavior was designed to avoid
unnecessary performance problems when running queries of that
kind. But it is not really clear how far the scanner should
take this optimization.

For example, OpenSCAP treats these filesystem types as non-local:
"ceph", "cifs", "smb3", "smbfs", "sshfs", "ncpfs", "ncp", "nfs",
"nfs4", "gfs", "gfs2", "glusterfs", "gpfs", "pvfs2", "ocfs2",
"lustre", "davfs", "afs". They are all network-mounted filesystems, and
there is no question about their remoteness or performance impact.
(They can still pose threats to the system, though, so it might be
questionable whether we should blindly exclude them from the scan. But
that's another topic.)
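
To make the type-based filter concrete, here is a minimal sketch in Python. It is not OpenSCAP's implementation; it only reuses the list of type names quoted above. The helper names (parse_mounts, remote_mount_points) and the sample mount table are assumptions made up for illustration, and the input is assumed to follow the /proc/mounts column layout.

```python
# Filesystem types treated as non-local, copied from the list in the message.
REMOTE_FS_TYPES = frozenset({
    "ceph", "cifs", "smb3", "smbfs", "sshfs", "ncpfs", "ncp", "nfs",
    "nfs4", "gfs", "gfs2", "glusterfs", "gpfs", "pvfs2", "ocfs2",
    "lustre", "davfs", "afs",
})


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Columns: device, mount point, filesystem type, options, ...
            mounts.append((fields[1], fields[2]))
    return mounts


def remote_mount_points(mounts):
    """Return the mount points whose type is in the remote list."""
    return [point for point, fs_type in mounts if fs_type in REMOTE_FS_TYPES]


# Hypothetical mount table used only for this example.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid 0 0
server:/export /mnt/share nfs4 rw,relatime 0 0
"""

print(remote_mount_points(parse_mounts(SAMPLE)))  # ['/mnt/share']
```

Note that "proc" is not flagged by this filter, which is exactly the gap raised in the next paragraph.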

Then there are the "proc" and "sysfs" filesystems, which are not really
remote, but at the same time not really filesystems. They are
projections of internal databases (sometimes very big ones), and not
really meant to be recursively traversed. Are we allowed to exclude
them as well?

Another interesting filesystem type is the OCI (Docker/Podman)
"overlay" (/var/lib/containers/storage/overlay/), which is not really
local, but in a different sense: it is mounted from virtualized
environments. A large number of containers might have a huge impact on
the performance of the host scan.

So, my question is, how exactly do we define "local" and "non-local"
filesystems?

Best regards,
Evgenii Kolesnikov

Senior Software Engineer
Security Compliance
Red Hat EMEA
https://www.redhat.com

David Solin
Sun, Apr 11, 2021 2:51 PM

Hi Evgeny,

This is an underspecified part of the specification, for which we’ve tried a few different approaches. Part of the problem is that file-system types can be (more or less) arbitrary strings, so there is no way to enumerate them conclusively.

Our original approach to solving this problem was to say that whatever filesystem type '/' had was the “local” type. So for local recursion, we’d filter using that type. More recently, we’ve come to use the -l switch of the df command to identify the “local” filesystems. We feel this is probably a better reflection of the intention behind the local attribute value. Either of the above approaches will exclude special filesystems like /proc.
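
As an illustration of the first approach (treating the type mounted at '/' as the “local” type), here is a minimal Python sketch. It is not the actual scanner code; the function name and the sample mount table are hypothetical.

```python
def mounts_matching_root_type(mounts):
    """Keep only mounts whose filesystem type equals the type mounted at '/'."""
    root_type = next((t for point, t in mounts if point == "/"), None)
    if root_type is None:
        return []
    return [(point, t) for point, t in mounts if t == root_type]


# Hypothetical mount table: (mount_point, fs_type) pairs.
mounts = [
    ("/", "xfs"),
    ("/home", "xfs"),
    ("/boot", "ext4"),
    ("/proc", "proc"),
    ("/mnt/share", "nfs4"),
]

print(mounts_matching_root_type(mounts))  # [('/', 'xfs'), ('/home', 'xfs')]
```

In this sample /proc and the NFS share are excluded, but so is the ext4 /boot partition, because the filter looks only at the type string.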

You have a good point about the overlay filesystem; I do not know offhand whether that filesystem reliably has a particular type (is it always “overlay”?). I am fairly certain, however, that the local attribute pre-dates the birth of Docker, and was intended only to prevent searches of NFS-mounted drives. We should come up with a more conclusive determination of its meaning today.

Best regards,
—David Solin

Evgeny Kolesnikov
Mon, Apr 12, 2021 7:42 AM

Hey David!

> Our original approach to solving this problem was to say that whatever filesystem type '/' had was the “local” type. So for local recursion, we’d filter using that type.

I thought that this scenario was covered by the 'defined' value. Should
it also be a part of the 'local' behaviour?

> More recently, we’ve come to use the -l switch of the df command to identify the “local” filesystems. We feel this is probably a better reflection of the intention behind the local attribute value. Either of the above approaches will exclude special filesystems like /proc.

The coreutils df tool has two types of filesystems it filters out:
dummy[1] and remote[2]. Both 'proc' and 'sysfs' are dummies, and they
are displayed by the df tool only when the -a (all) option is given. So,
I guess it would be correct for OpenSCAP to also support a list of
dummy filesystems to skip in 'local' mode.

Still, I think it would be better to at least mention this approach in
the spec, despite it not being a formal list of filesystems.
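
A rough Python sketch of that idea follows: skip both dummy and remote types in 'local' mode. The two sets are deliberately short and illustrative; only 'proc' and 'sysfs' are named as dummies in this thread, and the longer lists in gnulib [1][2] are not reproduced here. The function names and the sample mounts are hypothetical.

```python
# Illustrative sets only, not the gnulib lists.
DUMMY_FS_TYPES = frozenset({"proc", "sysfs"})
REMOTE_FS_TYPES = frozenset({"nfs", "nfs4", "cifs"})


def classify(fs_type):
    """Classify a filesystem type as 'dummy', 'remote' or 'local'."""
    if fs_type in DUMMY_FS_TYPES:
        return "dummy"
    if fs_type in REMOTE_FS_TYPES:
        return "remote"
    return "local"


def traversable_in_local_mode(mounts):
    """Return the mount points a scanner would descend into in 'local' mode."""
    return [point for point, fs_type in mounts if classify(fs_type) == "local"]


# Hypothetical mount table: (mount_point, fs_type) pairs.
mounts = [
    ("/", "xfs"),
    ("/proc", "proc"),
    ("/sys", "sysfs"),
    ("/mnt/share", "nfs4"),
    ("/mnt/merged", "overlay"),
]

print(traversable_in_local_mode(mounts))  # ['/', '/mnt/merged']
```

With only these two lists, an "overlay" mount is still classified as local.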

> You have a good point about the overlay filesystem; I do not know offhand whether that filesystem reliably has a particular type (is it always “overlay”?). I am fairly certain, however, that the local attribute pre-dates the birth of Docker, and was intended only to prevent searches of NFS-mounted drives. We should come up with a more conclusive determination of its meaning today.

Unfortunately, it is a regular "overlay" type, and moreover the df tool
thinks that it is a local filesystem:

# df -l
Filesystem     1K-blocks     Used Available Use% Mounted on
...
shm                64000        0     64000   0% /var/lib/containers/storage/overlay-containers/b992d5a015e84d8f336c1cf19dc080ea1c1ba0dc60c538c3774c253032e31bb2/userdata/shm
overlay         71711864 39340860  28685220  58% /var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/merged

# mount
...
overlay on /var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/merged type overlay (rw,nodev,relatime,context="system_u:object_r:container_file_t:s0:c320,c894",lowerdir=/var/lib/containers/storage/overlay/l/QJBWQN7FISPOYVK2S5PWYANC2U,upperdir=/var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/diff,workdir=/var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/work)

No idea how to formally differentiate it from other overlays at this moment.
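
For what it is worth, one informal heuristic would be to combine the filesystem type with the mount point location. The Python sketch below is only that, a heuristic: the path prefix is taken from the output above and is an assumption (other runtimes or configurations may store layers elsewhere), the helper names are hypothetical, and the sample lines are shortened stand-ins with a made-up ID.

```python
import re

# Matches lines printed by mount: "<source> on <target> type <fstype> (<options>)".
MOUNT_LINE = re.compile(
    r"^(?P<source>.+?) on (?P<target>.+?) type (?P<fstype>\S+) \((?P<options>.*)\)$"
)

# Assumption: container storage lives under this prefix, as in the output above.
CONTAINER_OVERLAY_PREFIX = "/var/lib/containers/storage/overlay/"


def parse_mount_line(line):
    """Split one line of mount output into its fields, or return None."""
    match = MOUNT_LINE.match(line.strip())
    return match.groupdict() if match else None


def looks_like_container_overlay(entry):
    """Heuristic only: an overlay mounted below the container storage prefix."""
    return (
        entry["fstype"] == "overlay"
        and entry["target"].startswith(CONTAINER_OVERLAY_PREFIX)
    )


# Hypothetical, shortened mount lines; "abc123" stands in for a real layer ID.
container = parse_mount_line(
    "overlay on /var/lib/containers/storage/overlay/abc123/merged "
    "type overlay (rw,nodev,relatime)"
)
other = parse_mount_line("overlay on /mnt/merged type overlay (rw,relatime)")

print(looks_like_container_overlay(container))  # True
print(looks_like_container_overlay(other))      # False
```

A path-based check like this is fragile and would not belong in a specification as-is; it only shows that type alone cannot separate the two cases.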

Best regards,
Evgenii Kolesnikov
Red Hat EMEA.

[1] https://github.com/coreutils/gnulib/blob/2a597a7ae16fb988daa473f8df2468d7188c118a/lib/mountlist.c#L164
[2] https://github.com/coreutils/gnulib/blob/2a597a7ae16fb988daa473f8df2468d7188c118a/lib/mountlist.c#L223

David Solin
Mon, Apr 12, 2021 12:54 PM

Hi Evgeny, see below...

> On Apr 12, 2021, at 2:42 AM, Evgeny Kolesnikov <ekolesni@redhat.com> wrote:
>
> Hey David!
>
>> Our original approach to solving this problem was to say that whatever filesystem type '/' had was the “local” type. So for local recursion, we’d filter using that type.
>
> I thought that this scenario was covered by the 'defined' value. Should
> it also be a part of the 'local' behaviour?

‘defined’ is a little different. What I described above operates on the filesystem type and therefore permits traversal between filesystems as long as they’re of the same type. The schema documentation describes ‘defined’ as follows:
“The value of 'defined' keeps any recursion within the file system that the file_object (path+filename or filepath) has specified.”

So, if you had ‘defined’ set, recursion should not cross from one filesystem into another; e.g., you wouldn’t be able to find a file in /tmp if you started from / (assuming /tmp is a separate mount).
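
One way to picture the 'defined' behavior is a walk that refuses to cross mount points, similar in spirit to find -xdev. The Python sketch below is an assumption about how that could be approximated, not the schema's normative definition: it treats "same filesystem" as "same st_dev as the starting directory", and the function names are hypothetical.

```python
import os
import tempfile


def walk_defined(start):
    """Yield file paths under start without descending into other filesystems."""
    start_dev = os.lstat(start).st_dev
    for dirpath, dirnames, filenames in os.walk(start):
        # Prune subdirectories that sit on a different device (mount points).
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == start_dev
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)


def demo():
    """Walk a throwaway directory tree and return the relative paths found."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "sub"))
        with open(os.path.join(tmp, "sub", "example.txt"), "w"):
            pass
        return sorted(os.path.relpath(p, tmp) for p in walk_defined(tmp))


print(demo())
```

By contrast, the type-based filter described earlier would happily cross from / into a separately mounted /tmp as long as both had the same filesystem type.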

>> More recently, we’ve come to use the -l switch of the df command to identify the “local” filesystems. We feel this is probably a better reflection of the intention behind the local attribute value. Either of the above approaches will exclude special filesystems like /proc.
>
> The coreutils df tool has two types of filesystems it filters out:
> dummy[1] and remote[2]. Both 'proc' and 'sysfs' are dummies, and they
> are displayed by the df tool only when the -a (all) option is given. So,
> I guess it would be correct for OpenSCAP to also support a list of
> dummy filesystems to skip in 'local' mode.
>
> Still, I think it would be better to at least mention this approach in
> the spec, despite it not being a formal list of filesystems.

I quite agree!

>> You have a good point about the overlay filesystem; I do not know offhand whether that filesystem reliably has a particular type (is it always “overlay”?). I am fairly certain, however, that the local attribute pre-dates the birth of Docker, and was intended only to prevent searches of NFS-mounted drives. We should come up with a more conclusive determination of its meaning today.
>
> Unfortunately, it is a regular "overlay" type, and moreover the df tool
> thinks that it is a local filesystem:
>
> # df -l
> Filesystem     1K-blocks     Used Available Use% Mounted on
> ...
> shm                64000        0     64000   0% /var/lib/containers/storage/overlay-containers/b992d5a015e84d8f336c1cf19dc080ea1c1ba0dc60c538c3774c253032e31bb2/userdata/shm
> overlay         71711864 39340860  28685220  58% /var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/merged
>
> # mount
> ...
> overlay on /var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/merged type overlay (rw,nodev,relatime,context="system_u:object_r:container_file_t:s0:c320,c894",lowerdir=/var/lib/containers/storage/overlay/l/QJBWQN7FISPOYVK2S5PWYANC2U,upperdir=/var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/diff,workdir=/var/lib/containers/storage/overlay/dda1b6d9ac2dfe3df0dca73e9cb0efdac11ba8b29fe1845bf2e455cf15f98f44/work)
>
> No idea how to formally differentiate it from other overlays at this moment.

We could introduce a new recurse behavior to control traversal into overlay filesystems — but, what other overlays are there?
