VSTI/SAS - Terrorism News Analysis

VAST 2011 Challenge
Mini-Challenge 3 - Investigation into Terrorist Activity

Authors and Affiliations:

Edward Swing, VSTI, a SAS Company: ed.swing@vsticorp.com [PRIMARY contact]

Kevin Boone, VSTI, a SAS Company

Tool(s):

For this challenge, we used a combination of several tools centered on the Luminary System, a prototype in development at VSTI. Luminary extracts semantic concepts and entities from information sources, performs semantic inference, and then pushes the results into a semantic wiki for users to browse and explore. We developed several enhancements to Luminary during this challenge.

Luminary uses modular plugins to incorporate multiple entity extractors, including Alchemy, Calais, Lingpipe, and OpenNLP. After entity extraction, Luminary passes the entities and entity types to a set of Entity Verifiers for verification and entity normalization. Next, Luminary performs Semantic Concept Extraction, attempting to identify particular concepts within each text document. Finally, the resulting entities, augmented articles, and semantic concepts are loaded into the semantic wiki.
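The four stages described above (pluggable extraction, verification and normalization, concept extraction, and loading) can be illustrated with a minimal Python sketch. This is not Luminary's code: every name here (`Article`, `toy_extractor`, `verify`, `extract_concepts`, `run_pipeline`) is hypothetical, the toy extractor merely stands in for a real plugin such as OpenNLP, and keyword matching stands in for the actual concept extraction logic, which this report does not describe.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# An entity is a (surface text, entity type) pair, e.g. ("Lark", "Name").
Entity = Tuple[str, str]
# Any callable with this shape can be plugged in as an extractor.
Extractor = Callable[[str], List[Entity]]


@dataclass
class Article:
    title: str
    text: str
    entities: List[Entity] = field(default_factory=list)
    concepts: List[str] = field(default_factory=list)


def toy_extractor(text: str) -> List[Entity]:
    """Hypothetical stand-in for a real plugin: capitalized words become names."""
    return [(w.strip(".,"), "Name") for w in text.split() if w[:1].isupper()]


def verify(entities: List[Entity]) -> List[Entity]:
    """Normalize whitespace and drop duplicates, preserving first-seen order."""
    seen = set()
    verified = []
    for name, etype in entities:
        key = (" ".join(name.split()), etype)
        if key[0] and key not in seen:
            seen.add(key)
            verified.append(key)
    return verified


def extract_concepts(text: str, keywords: Dict[str, List[str]]) -> List[str]:
    """Assumed approach: a concept matches if any of its keywords occurs in the text."""
    lowered = text.lower()
    return [c for c, words in keywords.items() if any(w in lowered for w in words)]


def run_pipeline(article: Article, extractors: List[Extractor],
                 keywords: Dict[str, List[str]]) -> Article:
    """Run every extractor, verify the merged entities, then tag concepts."""
    raw: List[Entity] = []
    for extractor in extractors:
        raw.extend(extractor(article.text))
    article.entities = verify(raw)
    article.concepts = extract_concepts(article.text, keywords)
    return article  # a real system would now upload the result to the wiki
```

Because each extractor is just a callable, adding another source of entities means appending one more function to the `extractors` list; the verification step then reconciles overlapping results.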

Note: Lack of internal consistency within the data, such as missing locations and incomplete name replacements, hampered the semantic concept extraction process.

To focus the processing of articles, Luminary used SAS Content Categorization Server to identify the general topic of each news article. The extraction process focused on the topics (Crime, Social Issues) most likely to be relevant to terrorism. Articles on other topics, such as TV schedules or sports scores, were ignored.
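The topic filter can be pictured as a simple allow-list applied after categorization. The sketch below assumes the categorizer has already assigned one topic label per article; the function name, the article identifiers, and the "Sports" and "TV Schedules" labels are hypothetical, and it does not call the SAS Content Categorization Server.

```python
from typing import Dict, List, Set

# Topics named in this report as most likely to be relevant to terrorism.
RELEVANT_TOPICS: Set[str] = {"Crime", "Social Issues"}


def select_articles(topic_by_article: Dict[str, str],
                    relevant: Set[str] = RELEVANT_TOPICS) -> List[str]:
    """Return the ids of articles whose assigned topic is on the allow-list."""
    return [article_id for article_id, topic in topic_by_article.items()
            if topic in relevant]


# Hypothetical categorizer output: article id -> topic label.
topics = {
    "article-001": "Crime",
    "article-002": "Sports",
    "article-003": "Social Issues",
    "article-004": "TV Schedules",
}
selected = select_articles(topics)
```

Only the selected articles would then be passed on to the more expensive concept extraction step.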

Semantic MediaWiki is a set of extensions for MediaWiki (the software used by Wikipedia). The extensions include visualization capabilities for social networks, timelines, maps, and graphs. The wiki generates visual displays at rendering time and provides notifications and a collaborative environment suitable for analysis.

Video:

Video

ANSWERS:


MC3.1: Potential Threats: Identify any imminent terrorist threats in the Vastopolis metropolitan area. Provide detailed information on the threat or threats (e.g. who, what, where, when, and how) so that officials can conduct counterintelligence activities. Also, provide a list of the evidential documents supporting your answer.

We used Luminary to extract entities and concepts and insert the information into the wiki. Luminary created a page for each entity and news article, along with referential pages, redirects for disambiguation, forms, categories, and templates. Unfortunately, event extraction was hampered by inconsistencies in the data.

We enhanced Luminary to account for particular irregularities found within the dataset. Development time was approximately 120 hours, while the extraction and ingestion process took about 6 hours. Visual analysis of the articles using the semantic wiki took approximately 12 hours.


Figure 1: Wiki page for a news article, showing extracted entities and topic.


Figure 2: Wiki page for a politician, showing her complex social network and the articles that reference her.

Once Luminary ingested, enhanced, and uploaded the articles, searching and browsing the information was simple. Initial queries on simple terms, such as "terror", yielded lists of articles where the term appeared. As we discovered new articles and information, additional searches provided further evidence and potential leads.


Figure 3: Wiki listing of articles mentioning terrorism. Browsing this list allows simple exploration of information.

Another approach involved checking articles under certain topics. Under the Crime Law and Justice topic, several articles appeared to be significant to Vastopolis security. We found several leads and events that suggest threats to Vastopolis. We identified and then set aside threats that occurred outside of Vastopolis or that involved white-collar crime, such as the money laundering ring involving Mayor Lark.


Figure 4: Wiki listing of articles in the Crime Law and Justice topic. This list is automatically generated from the semantic values within the wiki.
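Lists like the one in Figure 4 are typically produced in Semantic MediaWiki by an inline `#ask` query embedded in a wiki page. As a rough sketch, the helper below assembles such a query string in Python. The helper name, the category "News Article", and the properties "Topic" and "Date" are assumptions for illustration; the actual category and property names used in our wiki are not given in this report.

```python
from typing import Dict, List


def build_ask_query(category: str, filters: Dict[str, str],
                    printouts: List[str], fmt: str = "table") -> str:
    """Assemble a Semantic MediaWiki inline #ask query as wikitext."""
    # Conditions: a category membership plus one [[Property::value]] per filter.
    conditions = [f"[[Category:{category}]]"]
    conditions += [f"[[{prop}::{value}]]" for prop, value in filters.items()]
    # Printouts (?Property) and parameters are separated by pipes.
    parts = [" ".join(conditions)]
    parts += [f"?{prop}" for prop in printouts]
    parts.append(f"format={fmt}")
    return "{{#ask: " + " |".join(parts) + "}}"


# Hypothetical category and property names.
query = build_ask_query(
    "News Article",
    {"Topic": "Crime Law and Justice"},
    ["Date"],
)
print(query)
# {{#ask: [[Category:News Article]] [[Topic::Crime Law and Justice]] |?Date |format=table}}
```

Because the wiki evaluates the query each time the page is rendered, the listing stays current as new articles are ingested.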


Figure 5: Wiki page for Network of Hate, showing articles, timeline of activities, and social network.

Summary of Terrorist Groups and Threats