[OVAL DEVELOPER] regex_capture quantified subpattern behavior clarification

David Solin solin at jovalcm.com
Mon Aug 8 10:10:26 EDT 2016

Hi Jan,

The preceding paragraph in the RegexCaptureFunctionType schema documentation states:

“If the regular expression contains multiple capturing sub-patterns, only the first capture is used. If there are no capturing sub-patterns, the result for each target string must be the empty string. Otherwise, if the regular expression could match the target string in more than one place, only the first match (and its first capture) is used. If no matches are found in a target string, the result for that target must be the empty string.”

This indicates that, given the content you specified, the FIRST match should be the result (i.e., “fs.suid_dumpable = 1”).
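
The rules in that paragraph can be sketched in a few lines. This is only an illustration: it uses Python's re module as a stand-in for a scanner's regex engine, the function name regex_capture is a hypothetical helper rather than any OVAL tool's API, and the sample text is made up because the attached file is not reproduced here.

```python
import re


def regex_capture(pattern: str, target: str) -> str:
    """Hypothetical helper mirroring the quoted rules; not a real OVAL API."""
    compiled = re.compile(pattern)
    # No capturing sub-patterns: the result is the empty string.
    if compiled.groups == 0:
        return ""
    # Only the first (leftmost) match in the target is considered.
    match = compiled.search(target)
    if match is None:
        # No match at all: the result is the empty string.
        return ""
    # Only the first capture is used; an unset group is treated as empty here.
    return match.group(1) or ""


# Assumed sample content, not the actual attachment.
sample = "fs.suid_dumpable = 1\nfs.suid_dumpable = 4\n"
print(regex_capture(r"(fs\.suid_dumpable = \d)", sample))  # fs.suid_dumpable = 1
```

With a plain, unquantified group the leftmost match wins, which is consistent with the scanner returning the first entry.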

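The note you quote covers a different case: a capturing sub-pattern that has a quantifier applied to it and must therefore match several times within a single overall match. For example, “(\d)+” matched against “123” captures only “3”, the last repetition.

If what you want is the last occurrence in the text, a greedy prefix in front of the capturing sub-pattern will do it:

“^.*(fs\.suid_dumpable = \d).*$”

With that value as the pattern, the greedy “.*” consumes as much as it can before the group, so the capture is the last occurrence within the matched text rather than the first.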
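
A small sketch can show how these cases behave side by side. It again uses Python's re module as a stand-in for the scanner's engine (an assumption), and the single-line sample text is invented for illustration. It compares a plain group, a group with a quantifier applied to it, and the greedy-prefix pattern above.

```python
import re

# Assumed single-line sample; not the content of the attached file.
text = "fs.suid_dumpable = 1 fs.suid_dumpable = 4"

# Plain group: the leftmost match is used.
first = re.search(r"(fs\.suid_dumpable = \d)", text).group(1)

# Quantified group: one overall match in which the group repeats;
# only the last repetition is kept as the capture.
quantified = re.search(r"(fs\.suid_dumpable = \d\s*)+", text).group(1)

# Greedy prefix: the group matches once, at the rightmost position
# that still lets the rest of the pattern succeed.
greedy = re.search(r"^.*(fs\.suid_dumpable = \d).*$", text).group(1)

print(first)       # fs.suid_dumpable = 1
print(quantified)  # fs.suid_dumpable = 4
print(greedy)      # fs.suid_dumpable = 4
```

In the last two cases the capture is the final entry, but for different reasons: the quantified group repeats inside one match, while the greedy prefix simply moves a single group match as far right as possible.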

Best regards,
—David A. Solin


> On Aug 8, 2016, at 5:14 AM, Jan Lieskovsky <jlieskov at REDHAT.COM> wrote:
> Hello OVAL Developers,
>  there's the following documentation section in RegexCaptureFunctionType
> description:
> "Note that a quantified capturing sub-pattern does not produce multiple
> substrings.  Standard regular expression semantics are such that if a
> capturing sub-pattern is required to match multiple times in order for
> the overall regular expression to match, the capture produced is the
> last substring to have matched the sub-pattern."
> (from https://github.com/OVALProject/Language/blob/master/specifications/oval-language-specification.docx )
> If I am reading the above section correctly, when the text contains multiple
> "substrings" that the regex_capture pattern could match, and pattern
> quantification is used, the last matched item should be collected / returned
> by the scanner.
> But when I check this behaviour in the OpenSCAP scanner, the first matched
> instance is always returned (regardless of whether pattern quantification was
> used / specified or not).
> Consider the attached example OVAL file.
> Unless I have misunderstood something, the value collected by regex_capture()
> should be "fs.suid_dumpable = 4" (IOW the last one), not
> "fs.suid_dumpable = 1", as is currently the case, right? IMHO the last one should
> be collected, since a quantified sub-pattern was used in the regex_capture specification.
> Is this correct? Or have I overlooked something? If the latter, could you
> please provide an example of a pattern for which "regex_capture" would return
> the last substring that matched the sub-pattern, as described in the specification?
> Thank you && Regards, Jan
> --
> Jan iankko Lieskovsky / Red Hat Security Technologies Team
> ...<regex_capture_test.xml>