oval_developer@lists.cisecurity.org

A list for people interested in developing the OVAL language.


macos:plist_test/string values containing characters illegal in XML

David Solin
Tue, Oct 4, 2016 6:34 PM

The following plist file on Mac OS X is an XML plist encoded in UTF-8, but it contains characters that are not legal in “real” XML (e.g., U+0000):

/System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/HIServices.framework/Versions/Current/Resources/he.lproj/icdefaults.plist

The question I have is: how should an OVAL interpreter represent these illegal character values in XML 1.0?  I tried replacing the illegal characters in-stream with their XML character references (e.g., “&#x0;”), expecting this to be an acceptable way to express those characters in an XML document, but it turns out the references are converted into the actual Unicode characters and then declared illegal.
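
The behavior described above can be reproduced with a small sketch. It assumes Python's standard-library ElementTree parser (backed by expat) as a stand-in for whatever XML 1.0 parser an OVAL interpreter actually uses; the helper name `parses` and the `<value>` element are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is accepted as well-formed XML 1.0."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A raw control character embedded directly in the document.
raw_control = "<value>a\x01b</value>"
# The same character written as a numeric character reference.
referenced_control = "<value>a&#x1;b</value>"
# TAB is one of the few control characters XML 1.0 allows.
referenced_tab = "<value>a&#x9;b</value>"

for label, doc in [
    ("raw U+0001", raw_control),
    ("&#x1; reference", referenced_control),
    ("&#x9; reference", referenced_tab),
]:
    print(label, "->", "accepted" if parses(doc) else "rejected")
```

Under these assumptions the raw character and its reference are both rejected, because the reference is resolved to the character it names before the legality check is applied.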

The command “defaults read /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/HIServices.framework/Versions/Current/Resources/he.lproj/icdefaults.plist CharacterSet” produces the output:

(
"\001\002\003\004\005\006\\a\\b\\t\\n\\v\\f\\n\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037 !\\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\177\\U2022\\U2122\\U2260\\U221e\\U2265\\U2211\\U222b\\U03a9\\U221a\\U2248\\U2026\\U2014\\U2018\\U0178\\U2044\\U2202\\U2206\\U0152\\U201a\\U201e\\U2030\\Uf8ff\\U02c6\\U02dc\\U02d8\\U02d9\\U02da\\U02dd\\U02db\\U02c7\\U0131\\U0192\\U00a0\\U00a1\\U00a2\\U00a3\\U20ac\\U00a5\\U0153\\U00a7\\U00a8\\U00a9\\U00aa\\U00ab\\U00ac\\U2013\\U00ae\\U00af\\U00b0\\U00b1\\U201d\\U201c\\U00b4\\U00b5\\U00b6\\U00b7\\U00b8\\U2019\\U00ba\\U00bb\\U03c0\\U220f\\U2264\\U00bf\\U00c0\\U00c1\\U00c2\\U00c3\\U00c4\\U00c5\\U00c6\\U00c7\\U00c8\\U00c9\\U00ca\\U00cb\\U00cc\\U00cd\\U00ce\\U00cf\\U2039\\U00d1\\U00d2\\U00d3\\U00d4\\U00d5\\U00d6\\U25ca\\U00d8\\U00d9\\U00da\\U00db\\U00dc\\U2020\\Ufb01\\U00df\\U00e0\\U00e1\\U00e2\\U00e3\\U00e4\\U00e5\\U00e6\\U00e7\\U00e8\\U00e9\\U00ea\\U00eb\\U00ec\\U00ed\\U00ee\\U00ef\\U203a\\U00f1\\U00f2\\U00f3\\U00f4\\U00f5\\U00f6\\U00f7\\U00f8\\U00f9\\U00fa\\U00fb\\U00fc\\U2021\\Ufb02\\U00ff", "\001\002\003\004\005\006\\a\\b\\t\\n\\v\\f\\n\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037 !\\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\177\\U0192\\U2248\\U00ab\\U2026\\U2014\\U00f7\\U2039\\U00b7\\U2021\\U201a\\U2030\\U201e\\U00c2\\U00c1\\U00c8\\U00cb\\U00cd\\U00ce\\U00cc\\U00cf\\U00d3\\U00d4\\U00d2\\U00db\\U00da\\U00d9\\U02c6\\U0131\\U02d9\\U02d8\\U02da\\U00b8\\U203a\\U221e\\U00a2\\U00a3\\U00df\\U00c4\\U2202\\Ufb02\\U00c6\\U00a9\\U00c5\\U00a5\\U00ae\\U00c7\\U2206\\U00ff\\U00c9\\U00b1\\U00e6\\U00d1\\U2022\\U00b5\\U00e8\\U00d6\\U03a9\\U00ba\\U00dc\\U2122\\U222b\\U00e1\\U00ca\\U00af\\U00f8\\U00b0\\U00a8\\U00e0\\U00fc\\U00e2\\U00ea\\U00b4\\U00aa\\U00e4\\U2020\\U00bf\\U221a\\U2019\\U00eb\\U00b6\\U2260\\U00e3\\U2265\\U2264\\U00e5\\U03c0\\U02dc\\U25ca\\U02c7\\U00e7\\U00e9\\U00a7\\U2013\\Uf8ff\\Ufb01\\U02db\\U02dd\\U2211\\U00ed\\U00ec\\U00ee\\U00ac\\U00a0\\U00a1\\U00c0\\U00bb\\U00d5\\U0152\\U0153\\U00c3\\U201d\\U2018\\U00ef\\U201c\\U2044\\U20ac\\U0178\\U00fb\\U00f1\\U00f3\\U00d8\\U00f2\\U00f4\\U00f6\\U220f\\U00f5\\U00fa\\U00f9"
)

(i.e., an array of two pretty weird strings).

So… does anyone have any suggestions about encoding values like \001 in an XML document?
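
For discussion, here is one possible workaround, sketched in Python. It is not defined by OVAL; the function name `escape_illegal` and the `\uXXXX` text notation are assumptions made purely for illustration. The idea is to replace every character outside the XML 1.0 Char production with a visible textual escape before the value is written into the document.

```python
import re
import xml.etree.ElementTree as ET

# Anything NOT matched by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML10 = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def escape_illegal(text):
    """Replace XML 1.0-illegal characters with a literal \\uXXXX marker.

    Hypothetical convention: a value that already contains the literal
    text "\\u0001" becomes indistinguishable from an escaped U+0001, so a
    real scheme would also need to escape backslashes.
    """
    return _ILLEGAL_XML10.sub(
        lambda m: "\\u{:04X}".format(ord(m.group())), text
    )


# The start of a value like the one in the plist: control characters,
# then ordinary printable ASCII.
value = "\x01\x02\x03 !\"#"
element = ET.Element("value")
element.text = escape_illegal(value)

# The escaped value serializes and parses back as well-formed XML 1.0.
serialized = ET.tostring(element)
print(serialized)
print(ET.fromstring(serialized).text)
```

The obvious cost of this approach is that the collected value no longer equals the string stored in the plist, so any comparison in content would have to use the same convention.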

Best regards,
—David Solin

David A. Solin
Co-Founder, Research & Technology
solin@jovalcm.com
http://jovalcm.com/
  https://www.facebook.com/jovalcm https://www.linkedin.com/company/joval-continuous-monitoring
