Publication Type:Journal Article
Source:Journal of Defence & Security Technologies, Volume 4, Issue 1, Number 1, p.1-11 (2022)
Keywords: Classification, dataset labelling, machine learning, moderation, operational data exploitation, sanitization, sensitive media, trauma risk mitigation, user interaction
To develop new computer vision capabilities leveraging artificial intelligence, we will increasingly need to use operationally realistic training and validation datasets. Although operational full motion video and imagery datasets carry information regarding their provenance and classification level, these designations are often not indicative of the presence of potentially offensive or traumatizing content. As machine learning practitioners and data scientists increasingly need to work with unsanitized operational video and imagery data, they face a higher risk of exposure to sensitive and traumatic content. In this paper, we first raise awareness of this risk within the Defense community. We then propose several approaches for mitigating machine learning practitioners' exposure to offensive and traumatizing media, including dataset preprocessing procedures and viewing tool design considerations.
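One way to combine the preprocessing and viewing-tool ideas mentioned above is a triage step that maps automated per-frame sensitivity scores to display actions before any human labeler sees the footage. The sketch below is illustrative only: the threshold values and the `triage_frames` interface are assumptions for this example, not the paper's method, and the sensitivity scores would come from whatever upstream sensitive-media classifier is available.

```python
# Hypothetical sketch: triage frames by sensitivity score before human labeling.
# Scores are assumed to come from an upstream sensitive-media classifier;
# the thresholds below are illustrative assumptions, not values from the paper.

from dataclasses import dataclass


@dataclass
class TriageResult:
    frame_id: int
    action: str  # "pass", "blur_preview", or "quarantine"


def triage_frames(scores, blur_threshold=0.4, quarantine_threshold=0.8):
    """Map per-frame sensitivity scores in [0, 1] to viewing-tool actions.

    Frames below blur_threshold are shown normally; frames between the two
    thresholds are shown blurred until the labeler explicitly opts in; frames
    at or above quarantine_threshold are routed to specially trained reviewers.
    """
    results = []
    for frame_id, score in enumerate(scores):
        if score >= quarantine_threshold:
            action = "quarantine"
        elif score >= blur_threshold:
            action = "blur_preview"
        else:
            action = "pass"
        results.append(TriageResult(frame_id, action))
    return results


if __name__ == "__main__":
    for result in triage_frames([0.1, 0.5, 0.9]):
        print(result.frame_id, result.action)
```

A tiered design like this lets most benign frames flow through unchanged while concentrating the highest-risk material on a small, prepared reviewer pool, which is one concrete form the paper's "viewing tool design considerations" could take.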