Introduction to SCAP datastreams with openscap

What are datastreams?

A datastream can be thought of as an archive of interlinked SCAP content (XCCDF, OVAL, CPE, …). It was introduced in the SCAP 1.2 standard. Its purpose is similar to that of the SCAP bundle known from previous SCAP versions, and datastreams are expected to replace SCAP bundles. The concept is fairly simple but is made confusing by an unfortunate choice of nomenclature. Let's disambiguate the terms first.

There are two types of datastreams: source datastreams (SDS) and result datastreams (ARF). The idea is that the scanner takes a source datastream, evaluates it, and returns the results in the result datastream format. What makes things confusing is that each source datastream has a root element called “data-stream-collection” that contains one or more elements called “data-stream” inside it! So the datastream is actually a collection of datastreams…[1] The way I think about this is that each source datastream has one or more modes in which it can be evaluated. Most datastreams will only have one.
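The nesting can be made concrete with a small sketch. The skeleton written below is a simplified illustration and not schema-valid content: the ds prefix, the IDs and the omitted attributes are assumptions. count_datastreams is a hypothetical helper that does a plain text match instead of real XML parsing.

```shell
# Hypothetical helper: count opening <data-stream> tags in a file,
# with or without a namespace prefix. The trailing [[:space:]>] keeps
# <data-stream-collection> from being counted.
count_datastreams() {
    grep -o '<\([A-Za-z0-9_.-]*:\)\{0,1\}data-stream[[:space:]>]' "$1" | wc -l | tr -d ' '
}

# Simplified skeleton (assumption: NOT schema-valid, for illustration only).
skeleton=$(mktemp)
cat > "$skeleton" <<'EOF'
<ds:data-stream-collection>
  <ds:data-stream id="hypothetical_stream_1">
    <!-- references to checklists, checks and dictionaries go here -->
  </ds:data-stream>
  <ds:data-stream id="hypothetical_stream_2">
  </ds:data-stream>
  <ds:component id="hypothetical_component"/>
</ds:data-stream-collection>
EOF

count_datastreams "$skeleton"   # prints 2 for the skeleton above
rm -f "$skeleton"
```

For real content, oscap info is the more reliable way to see which streams a file contains.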

Support in openscap

Datastreams have been supported since 0.8.5 but the code has been continuously improved since then. This guide assumes that you are using the latest openscap, namely version 0.9.3.

Having an up-to-date libxml2 is also important. libxml2-2.7 or newer is recommended because of various XSD validation bugs in older versions.

Why should I care?

Despite looking like fancy wrapping around SCAP content, datastreams do have extra features that may be valuable for your use case. People creating content might be interested in signing it (using public-key cryptography). [1] [2] [3] Content consumers might like the ease of use and added flexibility.

Another reason is that SCAP datastreams are slowly becoming the recommended way to distribute all SCAP content. I would not be surprised if separate XCCDF and OVAL files are only used for content creation in the coming years.

Evaluating the datastream

Evaluating a datastream means evaluating one of the XCCDF or OVAL components inside it. The whole process is a bit more involved because of various links and references in the datastream, but openscap kindly hides this complexity for us.

Confirming content type

A recommended first step is to make sure the content is what is expected. The oscap tool has a convenient command for this called oscap info.

This will print information about the content, including referenced XCCDF and OVAL components. 

$ oscap info datastream.xml
Document type: Source Data Stream
Imported: 2012-12-18T13:46:13

Stream: scap_org.open-scap_example_datastream
Generated: (null)
Version: 1.2
Checklists:
        Ref-Id: scap_org.open-scap_cref_xccdf.xml
                Profile: xccdf_cdf_profile_Default
Checks:
        Ref-Id: scap_org.open-scap_cref_oval.xml
No dictionaries.
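
When scripting around oscap info, the component-ref IDs can be pulled out of its output. This is a minimal sketch, assuming the output layout matches the sample above (other openscap versions may format it differently). list_ref_ids is a hypothetical helper name.

```shell
# Hypothetical helper: print every component-ref ID found in
# `oscap info` output read from standard input, one per line.
list_ref_ids() {
    awk '$1 == "Ref-Id:" { print $2 }'
}

# Usage (assumes oscap is installed and datastream.xml exists):
#   oscap info datastream.xml | list_ref_ids
```

For the sample output above, this would list the XCCDF component-ref first and the OVAL component-ref second.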

XCCDF inside SDS evaluation

$ oscap xccdf eval datastream.xml

OVAL inside SDS evaluation

$ oscap oval eval datastream.xml

openscap automatically detects that the given file is a datastream and acts accordingly, splitting it into separate files before evaluation. You can also split it manually and evaluate it in the conventional way; see the section “Splitting the stream manually” below.

If the datastream contains more than one OVAL component or more than one XCCDF, the user can choose which one to evaluate.

XCCDF with given ID inside SDS evaluation

$ oscap xccdf eval --xccdf-id COMPONENT_ID datastream.xml

Use oscap info, as shown in the “Confirming content type” section, to print out the component-ref IDs, and pass these to the command-line options.

The same works for OVAL. Please review the oscap man page for more details about component-ref selection (--xccdf-id, --oval-id and --datastream-id).
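
A mistyped component-ref ID is easy to catch before the scan starts. The sketch below checks that the requested ID appears under the Checklists heading of oscap info and only then runs the evaluation. It assumes the oscap info layout shown earlier, and eval_xccdf_by_id is a hypothetical helper name.

```shell
# Hypothetical helper: evaluate the XCCDF with the given component-ref ID,
# but only if `oscap info` lists that ID under "Checklists:".
eval_xccdf_by_id() {
    sds=$1
    id=$2
    # Any non-indented line starts a new section; remember whether it is
    # the Checklists section, and look for the ID only inside it.
    if ! oscap info "$sds" | awk -v id="$id" '
            /^[^ \t]/ { in_list = ($1 == "Checklists:") }
            in_list && $1 == "Ref-Id:" && $2 == id { found = 1 }
            END { exit !found }'; then
        echo "checklist '$id' not found in $sds" >&2
        return 1
    fi
    oscap xccdf eval --xccdf-id "$id" "$sds"
}

# Usage (assumes oscap is installed and datastream.xml exists):
#   eval_xccdf_by_id datastream.xml scap_org.open-scap_cref_xccdf.xml
```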

More options

You can use all the other options you are accustomed to when evaluating a single XCCDF file: selecting profiles in the XCCDF, creating results and guides, … See the XCCDF section of the oscap man page for more details.

Result datastream

If you are evaluating a source datastream, it is very likely that you want a result datastream back. There is a special command-line switch for that called --results-arf. It behaves similarly to --results but writes out the file in ARF format instead of the XCCDF results format. ARF stands for Asset Reporting Format and is the standard for result datastreams. [1]

$ oscap xccdf eval --results-arf result.xml datastream.xml
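
In scripts it helps to tell a failed scan from a scan that ran fine but found non-compliant rules. The sketch below relies on the exit codes described in the oscap man page (0 for success, 1 for an error, 2 for a completed evaluation of a non-compliant system). Please verify this against the man page of your installed version. scan_to_arf is a hypothetical helper name.

```shell
# Hypothetical helper: evaluate a source datastream, write ARF results,
# and report what the exit code of oscap means.
scan_to_arf() {
    sds=$1
    arf=$2
    rc=0
    oscap xccdf eval --results-arf "$arf" "$sds" || rc=$?
    case $rc in
        0) echo "scan finished, all rules passed; ARF written to $arf" ;;
        2) echo "scan finished, some rules did not pass; ARF written to $arf" ;;
        *) echo "oscap reported an error (exit code $rc)" >&2
           return "$rc" ;;
    esac
}

# Usage (assumes oscap is installed and datastream.xml exists):
#   scan_to_arf datastream.xml result.xml
```

The helper returns success for both exit codes 0 and 2, because in both cases the result file is usable.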

Splitting the stream manually

In some cases it is more suitable to split the datastream into multiple files and work with these, especially if you already have an established workflow around such files. openscap allows you to do this via its oscap command-line tool.

$ oscap ds sds-split datastream.xml target_folder/

If the content inside the datastream requires it, subfolders will be created in the target folder. As with evaluation, you might want to specify --datastream-id and/or --xccdf-id if there is more than one XCCDF in the datastream you are splitting. Otherwise, the first suitable XCCDF is used.
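
Since the names of the extracted files depend on the content, it can be handy to list them right after splitting. This is a minimal sketch. split_and_list is a hypothetical helper, and creating the target folder up front is a precaution on my part, not something the tool is known to require.

```shell
# Hypothetical helper: split a source datastream into a folder and
# list every file that ended up there, including files in subfolders.
split_and_list() {
    sds=$1
    target=$2
    mkdir -p "$target" || return 1
    oscap ds sds-split "$sds" "$target" || return 1
    find "$target" -type f | sort
}

# Usage (assumes oscap is installed and datastream.xml exists):
#   split_and_list datastream.xml target_folder/
```

The listed XCCDF file can then be passed to oscap xccdf eval in the conventional way.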

Creating a datastream from separate files

openscap also allows you to create a datastream from separate files. The implementation only allows this for the most common case of one XCCDF with one or more OVAL files referenced. The referenced OVAL files are automatically detected and packed in.

$ oscap ds sds-compose some-xccdf.xml target-datastream.xml
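
After composing, it is worth confirming that the OVAL files were really picked up. The sketch below chains the compose step with the oscap info check from earlier in this guide. compose_and_check is a hypothetical helper name.

```shell
# Hypothetical helper: compose a datastream from an XCCDF file and
# immediately print what ended up inside it.
compose_and_check() {
    xccdf=$1
    out=$2
    oscap ds sds-compose "$xccdf" "$out" || return 1
    oscap info "$out"
}

# Usage (assumes oscap is installed and some-xccdf.xml exists):
#   compose_and_check some-xccdf.xml target-datastream.xml
```

The Checks section of the printed information should list a component-ref for each OVAL file the XCCDF references.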

References, further reading