<section title="21.3. Content Filtering"><subsection title="Objective"><paragraph
    title="21.3.1."


><![CDATA[<p>The flow of data within gateways is examined, and controls are applied in accordance with the agency’s security policy, to prevent unauthorised or malicious content crossing security domain boundaries.</p>]]></paragraph>
 </subsection>
<subsection title="Context"> <block title="Scope"><paragraph
    title="21.3.2."


><![CDATA[<p>This section covers information relating to the use of content filters within bi-directional or one-way gateways in order to protect security domains.</p>]]></paragraph>
<paragraph
    title="21.3.3."


><![CDATA[<p>Content filters reduce the risk of unauthorised or malicious content crossing a security domain boundary.</p>]]></paragraph>
</block>
</subsection>
<subsection title="Rationale &amp; Controls"> <block title="Limiting transfers by file type"><paragraph
    title="21.3.4.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>The level of security risk will be affected by the degree of assurance agencies can place in the ability of their data transfer filters to:</p><ul>
<li>confirm the file type by examination of the contents of the file;</li>
<li>confirm the absence of malicious content;</li>
<li>confirm the absence of inappropriate content;</li>
<li>confirm the classification of the content; and</li>
<li>handle compressed files appropriately.</li>
</ul><p>Reducing the number of allowed file types reduces the number of potential vulnerabilities available for an attacker to exploit.</p>]]></paragraph>
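<paragraph
    title="Illustrative example (non-normative)"
><![CDATA[<p>As a rough illustration of confirming a file type by examining the contents of the file, the sketch below compares the leading bytes of a file against a small table of well-known signatures and then checks the result against an allow list. The names <code>SIGNATURES</code>, <code>ALLOWED_TYPES</code>, <code>sniff_type</code> and <code>transfer_permitted</code> are hypothetical, and the allow list shown is an assumption; in practice it would come from business requirements and the security risk assessment.</p>

```python
from pathlib import PurePath

# Leading-byte signatures for a few common formats (illustrative subset only).
SIGNATURES = {
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
    "jpg": b"\xff\xd8\xff",
    "zip": b"PK\x03\x04",
}

# Hypothetical allow list; a real one is derived from the risk assessment.
ALLOWED_TYPES = {"pdf", "png"}


def sniff_type(data: bytes):
    """Return the type whose signature matches the start of data, else None."""
    for file_type, magic in SIGNATURES.items():
        if data.startswith(magic):
            return file_type
    return None


def transfer_permitted(filename: str, data: bytes) -> bool:
    """Permit a transfer only if the type derived from the content is on the
    allow list and agrees with the file extension."""
    detected = sniff_type(data)
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return detected in ALLOWED_TYPES and detected == extension
```

<p>A signature match is only a first step: container formats share signatures (many office document formats are ZIP archives), so a production filter would normally parse the file further before trusting the detected type.</p>]]></paragraph>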
<paragraph
    title="21.3.4.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="Secret, Confidential, Top Secret"
    compliance="Must"
    cid="4321"
><![CDATA[<p>Agencies MUST strictly define and limit the types of files that can be transferred based on business requirements and the results of a security risk assessment.</p>]]></paragraph>
<paragraph
    title="21.3.4.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4322"
><![CDATA[<p>Agencies SHOULD strictly define and limit the types of files that can be transferred based on business requirements and the results of a security risk assessment.</p>]]></paragraph>
</block>
<block title="Blocking active content"><paragraph
    title="21.3.5.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Many files are executable and are potentially harmful if activated by a system user. &nbsp;Many static file type specifications allow active content to be embedded within the file, which increases the attack surface.</p>]]></paragraph>
<paragraph
    title="21.3.5.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="Secret, Confidential, Top Secret"
    compliance="Must"
    cid="4325"
><![CDATA[<p>Agencies MUST block all executables and active content from entering a security domain.</p>]]></paragraph>
<paragraph
    title="21.3.5.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4326"
><![CDATA[<p>Agencies SHOULD block all executables and active content from being communicated through gateways.</p>]]></paragraph>
</block>
<block title="Blocking suspicious data"><paragraph
    title="21.3.6.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>The definition of suspicious content will depend on the system’s risk profile and what is considered normal traffic. &nbsp;The table below identifies some filtering techniques that can be used to identify suspicious data.</p><table class="table-main">
<tbody>
<tr>
<td>
<p>Technique</p>
</td>
<td>
<p>Purpose</p>
</td>
</tr>
<tr>
<td>
<p><strong>Antivirus scan</strong></p>
</td>
<td>
<p>Scans the data for viruses and other malicious code.</p>
</td>
</tr>
<tr>
<td>
<p><strong>Data format check</strong></p>
</td>
<td>
<p>Inspects data to ensure that it conforms to expected/permitted format(s).</p>
</td>
</tr>
<tr>
<td>
<p><strong>Data range check</strong></p>
</td>
<td>
<p>Checks the data within each field to ensure that it falls within the expected/permitted range.</p>
</td>
</tr>
<tr>
<td>
<p><strong>Data type check</strong></p>
</td>
<td>
<p>Inspects each file header to determine the file type.</p>
</td>
</tr>
<tr>
<td>
<p><strong>File extension check</strong></p>
</td>
<td>
<p>Checks file extensions to ensure that they are permitted.</p>
</td>
</tr>
<tr>
<td>
<p><strong>Keyword search</strong></p>
</td>
<td>
<p>Searches data for keywords or ‘dirty words’ that could indicate the presence of classified or inappropriate material.</p>
</td>
</tr>
<tr>
<td>
<p><strong>Metadata check</strong></p>
</td>
<td>
<p>Inspects files for metadata that should be removed prior to release.</p>
</td>
</tr>
<tr>
<td>
<p><strong>Protective marking check</strong></p>
</td>
<td>
<p>Validates the protective marking of the data to ensure that it complies with the permitted classifications and endorsements.</p>
</td>
</tr>
<tr>
<td>
<p><strong>Manual inspection</strong></p>
</td>
<td>
<p>Manually inspects data for suspicious content that an automated system could miss. This is particularly important for the transfer of image files, multimedia or content-rich files.</p>
</td>
</tr>
</tbody>
</table>]]></paragraph>
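<paragraph
    title="Illustrative example (non-normative)"
><![CDATA[<p>The keyword search technique in the table above could be sketched as follows. The keyword list, the function names and the "quarantine"/"allow" decision strings are all assumptions made for illustration; real keyword lists and handling rules are set by agency policy.</p>

```python
import re

# Hypothetical 'dirty word' list; real lists are defined by agency policy.
KEYWORDS = ["TOP SECRET", "SECRET", "CONFIDENTIAL"]

# Whole-word, case-insensitive matching, so that "secretary" is not flagged.
# Longer phrases are listed first so "TOP SECRET" is reported as one hit.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def find_keywords(text: str) -> list:
    """Return the distinct keywords found in text, upper-cased and sorted."""
    return sorted({match.group(0).upper() for match in PATTERN.finditer(text)})


def filter_decision(text: str) -> str:
    """Quarantine any text containing a keyword; otherwise allow it."""
    return "quarantine" if find_keywords(text) else "allow"
```

<p>A simple text match of this kind will miss keywords hidden in images, encoded attachments or unusual spacing, which is one reason the table also lists manual inspection.</p>]]></paragraph>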
<paragraph
    title="21.3.6.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Must"
    cid="4329"
><![CDATA[<p>Agencies MUST block, quarantine or drop any data identified by a data filter as suspicious until reviewed and approved for transfer by a trusted source other than the originator.</p>]]></paragraph>
</block>
<block title="Content validation"><paragraph
    title="21.3.7.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Content validation aims to ensure that the content received conforms to a defined, approved standard. Content validation can be an effective means of identifying malformed content, allowing agencies to block potentially malicious content. Content validation operates on an allow listing principle, blocking all content except for that which is explicitly permitted. Examples of content validation include:</p>
<ul>
<li>ensuring numeric fields contain only numeric characters;</li>
<li>ensuring other fields contain only characters from defined character sets;</li>
<li>ensuring content falls within acceptable length boundaries; and</li>
<li>ensuring XML documents are validated against a strictly defined XML schema.</li>
</ul>]]></paragraph>
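<paragraph
    title="Illustrative example (non-normative)"
><![CDATA[<p>The field-level checks listed above can be sketched as an allow-list validator. The record layout, the field names <code>account_id</code> and <code>surname</code>, and the length limits are hypothetical and chosen only for illustration. XML schema validation is not shown.</p>

```python
import re

# Hypothetical record schema: field name -> (permitted pattern, maximum length).
SCHEMA = {
    "account_id": (re.compile(r"[0-9]+"), 10),      # numeric characters only
    "surname": (re.compile(r"[A-Za-z' -]+"), 40),   # defined character set
}


def validate(record: dict) -> list:
    """Return a list of validation failures; an empty list means the record passes."""
    errors = []
    # Allow-list principle: any field that is not explicitly defined is rejected.
    for name in sorted(record.keys() - SCHEMA.keys()):
        errors.append(f"{name}: unexpected field")
    for name, (pattern, max_len) in SCHEMA.items():
        value = record.get(name)
        if not isinstance(value, str):
            errors.append(f"{name}: missing or not text")
        elif len(value) > max_len:
            errors.append(f"{name}: too long")
        elif not pattern.fullmatch(value):
            errors.append(f"{name}: illegal characters")
    return errors
```

<p>A content filter built this way would block any record for which <code>validate</code> returns a non-empty list, rather than attempting to repair it.</p>]]></paragraph>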
<paragraph
    title="21.3.7.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="Confidential, Secret, Top Secret"
    compliance="Must"
    cid="4332"
><![CDATA[<p>Agencies MUST perform validation on all data passing through a content filter, blocking content which fails the validation.</p>]]></paragraph>
<paragraph
    title="21.3.7.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4333"
><![CDATA[<p>Agencies SHOULD perform validation on all data passing through a content filter, blocking content which fails the validation.</p>]]></paragraph>
</block>
<block title="Content conversion and transformation"><paragraph
    title="21.3.8.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Content conversion, file conversion or file transformation can be an effective method to render potentially malicious content harmless by separating the presentation format from the data. By converting a file to another format, the exploit, active content and/or payload can often be removed or disrupted enough to be ineffective.<br>Examples of file conversion and content transformation to mitigate the threat of content exploitation include:</p>
<ul>
<li>converting a Microsoft Word document to a PDF file;</li>
<li>converting a Microsoft PowerPoint presentation to a series of JPEG images;</li>
<li>converting a Microsoft Excel spreadsheet to a Comma Separated Values (CSV) file; or</li>
<li>converting a PDF document to a plain text file.</li>
</ul>
<p>Some file types, such as XML, will not benefit from conversion. The conversion process should also be applied to any attachments or files contained within other files, for example, archive files or encoded files embedded in XML.</p>]]></paragraph>
<paragraph
    title="21.3.8.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4336"
><![CDATA[<p>Agencies SHOULD perform content conversion, file conversion or both for all ingress or egress data transiting a security domain boundary.</p>]]></paragraph>
</block>
<block title="Content sanitisation"><paragraph
    title="21.3.9.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Sanitisation is the process of attempting to make potentially malicious content safe to use by removing or altering active content while leaving the original content as intact as possible. Sanitisation is not as secure a method of content filtering as conversion, though many techniques may be combined. Extraneous application and protocol data, including metadata, should also be inspected and filtered where possible. Examples of sanitisation to mitigate the threat of content exploitation include:</p>
<ul>
<li>removal of document properties information in Microsoft Office documents;</li>
<li>removal or renaming of JavaScript sections from PDF files;</li>
<li>removal of metadata such as EXIF information from within JPEG files.</li>
</ul>]]></paragraph>
<paragraph
    title="21.3.9.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4339"
><![CDATA[<p>Agencies SHOULD perform content and file sanitisation on suitable file types if content conversion or file conversion is not appropriate for data transiting a security domain boundary.</p>]]></paragraph>
</block>
<block title="Antivirus scans"><paragraph
    title="21.3.10.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Antivirus scanning is used to prevent, detect and remove malicious software, including computer viruses, worms, Trojans, spyware and adware.</p>]]></paragraph>
<paragraph
    title="21.3.10.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4348"
><![CDATA[<p>Agencies SHOULD perform antivirus scans on all content using multiple different scanning engines, with each engine and its signatures kept up to date.</p>]]></paragraph>
</block>
<block title="Archive and container files"><paragraph
    title="21.3.11.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Archive and container files can be used to bypass content filtering processes if the content filter does not handle the file type and embedded content correctly. &nbsp;The content filtering process should recognise archive and container files, ensuring the embedded files they contain are subject to the same content filtering measures as un-archived files.</p>]]></paragraph>
<paragraph
    title="21.3.11.R.02."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Archive files can be constructed in a manner which can pose a denial-of-service risk due to processor, memory or disk space exhaustion. &nbsp;To limit the risk of such an attack, content filters can specify resource constraints/quotas while extracting these files. &nbsp;If these constraints are exceeded the inspection is terminated, the content blocked and a security administrator alerted.</p>]]></paragraph>
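<paragraph
    title="Illustrative example (non-normative)"
><![CDATA[<p>A minimal sketch of resource quotas for ZIP archives, using Python's standard <code>zipfile</code> module, is shown below. The quota values, the <code>ArchiveBlocked</code> exception and the <code>check_archive</code> function are assumptions for illustration. The checks run against the sizes declared in the archive headers before anything is extracted, and encrypted members are treated as content that cannot be inspected.</p>

```python
import io
import zipfile

# Hypothetical quotas; real values depend on the content filter's capacity.
MAX_MEMBERS = 1000
MAX_TOTAL_BYTES = 100 * 1024 * 1024
MAX_RATIO = 100  # declared uncompressed size / compressed size


class ArchiveBlocked(Exception):
    """Raised when an archive exceeds a quota or cannot be inspected."""


def check_archive(data: bytes) -> list:
    """Return the member names if the archive is within quota,
    otherwise raise ArchiveBlocked."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as exc:
        raise ArchiveBlocked("not a readable ZIP archive") from exc
    with archive:
        members = archive.infolist()
        if len(members) > MAX_MEMBERS:
            raise ArchiveBlocked("too many members")
        total = 0
        for info in members:
            if info.flag_bits & 0x1:  # bit 0 set: member is encrypted
                raise ArchiveBlocked(f"{info.filename}: encrypted, cannot inspect")
            total += info.file_size
            if total > MAX_TOTAL_BYTES:
                raise ArchiveBlocked("uncompressed size exceeds quota")
            if info.file_size > MAX_RATIO * max(info.compress_size, 1):
                raise ArchiveBlocked(f"{info.filename}: suspicious compression ratio")
        return [info.filename for info in members]
```

<p>Declared sizes can be forged, so a fuller implementation would also enforce byte and time limits while reading each member, would handle nested archives, and would alert a security administrator whenever <code>ArchiveBlocked</code> is raised.</p>]]></paragraph>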
<paragraph
    title="21.3.11.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4401"
><![CDATA[<p>Agencies SHOULD extract the contents from archive and container files and subject the extracted files to content filter tests.</p>]]></paragraph>
<paragraph
    title="21.3.11.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4402"
><![CDATA[<p>Agencies SHOULD perform controlled inspection of archive and container files to ensure that content filter performance and availability are not adversely affected.</p>]]></paragraph>
<paragraph
    title="21.3.11.C.03."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4403"
><![CDATA[<p>Agencies SHOULD block files that cannot be inspected and generate an alert or notification.</p>]]></paragraph>
</block>
<block title="Allow listing permitted content"><paragraph
    title="21.3.12.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Creating and enforcing an allow list of permitted content/files is a strong content filtering method. &nbsp;Allowing only content that satisfies a business requirement can reduce the attack surface of the system. &nbsp;As a simple example, an email content filter might allow only Microsoft Office documents and PDF files.</p>]]></paragraph>
<paragraph
    title="21.3.12.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="Top Secret, Confidential, Secret"
    compliance="Must"
    cid="4406"
><![CDATA[<p>Agencies MUST create and enforce an allow list of permitted content types based on business requirements and the results of a security risk assessment.</p>]]></paragraph>
<paragraph
    title="21.3.12.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4407"
><![CDATA[<p>Agencies SHOULD create and enforce an allow list of permitted content types based on business requirements and the results of a security risk assessment.</p>]]></paragraph>
</block>
<block title="Data integrity"><paragraph
    title="21.3.13.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Ensuring the authenticity and integrity of content reaching a security domain is a key component in ensuring its trustworthiness. It is also essential that content authorised for release from a security domain is not modified and does not contain other data that is not authorised for release, for example through the addition or substitution of sensitive information.</p>]]></paragraph>
<paragraph
    title="21.3.13.R.02."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>If content passing through a filter contains a form of integrity protection, such as a digital signature, the content filter should verify the content’s integrity before allowing it through. If the content fails these integrity checks it may have been spoofed or tampered with and should be dropped or quarantined for further inspection.</p>
<p>Examples of data integrity checks include:</p>
<ul>
<li>an email server or content filter verifying an email protected by DKIM;</li>
<li>a web service verifying the XML digital signature contained within a SOAP request;</li>
<li>validating a file against a separately supplied hash;</li>
<li>checking that data to be exported from the security domain has been digitally signed by the release authority.</li>
</ul>]]></paragraph>
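<paragraph
    title="Illustrative example (non-normative)"
><![CDATA[<p>Of the integrity checks listed above, validating a file against a separately supplied hash is the simplest to sketch. The example assumes the hash is a SHA-256 digest supplied as a hexadecimal string; the algorithm choice and the function name <code>matches_supplied_hash</code> are assumptions, not requirements of this section.</p>

```python
import hashlib
import hmac


def matches_supplied_hash(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 digest of data with a separately supplied hex digest."""
    actual = hashlib.sha256(data).hexdigest().encode("ascii")
    expected = expected_hex.strip().lower().encode("utf-8")
    # compare_digest runs in constant time for equal-length inputs.
    return hmac.compare_digest(actual, expected)
```

<p>Content for which this check returns <code>False</code> would be dropped or quarantined. The check is only meaningful if the expected hash arrives over a channel that an attacker who can alter the file cannot also alter.</p>]]></paragraph>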
<paragraph
    title="21.3.13.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="Confidential, Secret, Top Secret"
    compliance="Must"
    cid="4411"
><![CDATA[<p>If data is signed, agencies MUST ensure that the signature is validated before the data is exported.</p>]]></paragraph>
<paragraph
    title="21.3.13.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4412"
><![CDATA[<p>Agencies SHOULD verify the integrity of content where applicable, and block the content if verification fails.</p>]]></paragraph>
</block>
<block title="Encrypted data"><paragraph
    title="21.3.14.R.01."

    tags="Data Management,Encryption,Technical,Content Filtering"


><![CDATA[<p>Encryption can be used to bypass content filtering if encrypted content cannot be subjected to the same checks performed on unencrypted content. Agencies will need to consider the need to decrypt content, depending on:</p>
<ul>
<li>the security domain they are communicating with;</li>
<li>whether the need-to-know principle is to be enforced;</li>
<li>end-to-end encryption requirements; or</li>
<li>any privacy and policy requirements.</li>
</ul>]]></paragraph>
<paragraph
    title="21.3.14.R.02."

    tags="Data Management,Encryption,Technical,Content Filtering"


><![CDATA[<p>Choosing not to decrypt content poses a risk of encrypted malicious software communications and data moving between security domains. &nbsp;Additionally, encryption could mask information at a higher classification passing to a security domain of lower classification, which could result in a data spill.</p>]]></paragraph>
<paragraph
    title="21.3.14.R.03."

    tags="Data Management,Encryption,Technical,Content Filtering"


><![CDATA[<p>Some systems allow encrypted content through external/boundary/perimeter controls to be decrypted at a later stage, in which case the content should be subject to all applicable content filtering controls after it has been decrypted.</p>]]></paragraph>
<paragraph
    title="21.3.14.C.01."

    tags="Data Management,Encryption,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4417"
><![CDATA[<p>Agencies SHOULD decrypt and inspect all encrypted content, traffic and data to allow content filtering.</p>]]></paragraph>
</block>
<block title="Monitoring data import and export"><paragraph
    title="21.3.15.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>To ensure the continued confidentiality and integrity of systems and data, import and export processes should be monitored and audited.</p>]]></paragraph>
<paragraph
    title="21.3.15.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Must"
    cid="4420"
><![CDATA[<p>Agencies MUST use protective marking checks to restrict the export of data from each security domain, including through a gateway.</p>]]></paragraph>
<paragraph
    title="21.3.15.C.02."

    tags="Data Management,Technical,Content Filtering"


    classification="Secret, Confidential, Top Secret"
    compliance="Must"
    cid="4421"
><![CDATA[<p>When importing data to each security domain, including through a gateway, agencies MUST audit the complete data transfer logs at least monthly.</p>]]></paragraph>
</block>
<block title="Exception Handling"><paragraph
    title="21.3.16.R.01."

    tags="Data Management,Technical,Content Filtering"


><![CDATA[<p>Legitimate reasons may exist for the transfer of data that may be identified as suspicious according to the criteria established for content filtering. &nbsp;It is important to have an accountable and auditable mechanism in place to deal with such exceptions.</p>]]></paragraph>
<paragraph
    title="21.3.16.C.01."

    tags="Data Management,Technical,Content Filtering"


    classification="All Classifications"
    compliance="Should"
    cid="4424"
><![CDATA[<p>Agencies SHOULD create an exception handling process to deal with blocked or quarantined file types that may have a valid requirement to be transferred.</p>]]></paragraph>
</block>
</subsection>
</section>
