ATS – NdCore and REGGAE

VAST 2007 Contest Submission

Authors and Affiliations:

Lynn Schwendiman, ATS, lynn.schwendiman@atsid.com
Jonathan McLean, ATS, jonathan.mclean@atsid.com
Jonathan Larson, ATS, jonathan.larson@atsid.com

Student team: [  ] YES  [ X ] NO 

Tool(s):

NdCore and REGGAE (Relationship Generating Graph Analysis Engine) are two proprietary data analysis and discovery applications developed by ATS.

 

NdCore (2003-present) is a powerful tool for integrating and analyzing large volumes of data from disparate sources. NdCore ingests discrete and/or textual data from RDBMSs and a variety of file formats (plain text, MSWord, pdf, xml) into a single repository for analysis.

 

NdCore's text analysis generates two- and three-word concepts based on the frequency, usage, and relative proximity of words within the ingested documents. The NdCore Concept Builder guides the user in defining a multi-term search phrase based on the frequency of word combination occurrences. By suggesting words as the search phrase is created, Concept Builder helps the user make intelligent decisions about which terms will help find documents of interest. The web-based user interface allows the user to quickly browse through the selected documents and to find other documents similar to them.

 

REGGAE, a prototype currently under development, provides the associative search capabilities of a graph database and the tabular capabilities of an RDBMS while also allowing multidimensional analysis. It is specifically tailored for visual analytics. Its features include:

 

  • Advanced entity relationship analytics
  • Novel two-tiered context-based graph architecture built on current commercial RDBMS
  • Data mining (Graph, Similarity Analysis, etc.)
  • OLAP
  • True multidimensional analysis and dynamic modeling
  • Relationship generation through entity aggregation
  • Foundations on relational, graph, and multidimensional databases
  • Easy integration with existing data stores

 

Data set used:   [   ] RAW DATA SET     [ X ] PRE-PROCESSED  SET

 

 

TOC:  Who - What - Where - Debriefing - Process - Video

 


1. WHO: Who are the players engaging in questionable activities in the plot(s)? When appropriate, specify the organization they are associated with.

Name | Associated organization | Involved in illegal activities? (Yes/No) | Involved in terrorist activities? (Yes/No) | Most relevant source files (5 MAX)
Madhi Kim | Global Ways, Wild Things | Yes | No | ImportPermits, 20031027_57, 20040105-1_58, 20040308_109, 20040412-2_13
Abu Hassan | Assan Circus | Yes | No | ImportPermits, 20031013_4, 20031215-1_91, 20040301-1_75
Cesar Gil | Gil Breeders | Yes | Yes | Chinchilla Dreamin’, 20030609_4, 20030901-1_36, 20040705_83, 20040705_86
Faron Gardner | Animal Justice League | Yes | Yes | 20030602-1_66, 20030609_4, 20030818_23, Chinchilla Dreamin’ (textrip3)
r’Bear | Shravaana (or Shraavana) | No | No | 20030609_7, 20040119-1_98, 20040308_109, 20040614_94, 20040628_61
Luella Vedric | SPOMA | No | No | 20030526-2_57, 20031013_4, 20040119-1_98, 20040412-2_13
Collie (Catherine) Carnes | SPOMA | No | No | 20031013_4, 20030818_23, 20030526-2_57, Chinchilla Dreamin’ (textrip3)

 


2. WHEN/WHAT: What events occurred during this time frame that are most relevant to the plot(s)?

 

# | Date (can be a range) | Event description | Most relevant source files (5 max)
1 | 1 Mar 2003 – 1 Mar 2004 | Import permits issued to Global Ways for assorted wildlife, all consigned to Abu Hassan in various African countries. | ImportPermits
2 | 25 May 2003 | Chinchillas reported as the latest pet fad in LA. | Chinchilla Dreamin’ (textrip11)
3 | 6 Jun 2003 | Animal Justice League abducts “back room” animals from PetSmart stores in LA, promises this would not be the last revenge attack. | 20030602-1_66
4 | 16 Jul 2003 | Animal Justice League claims to have poisoned meat in 20 LA supermarkets; no poisoned meat found. | 20030714-2_25
5 | 15 Aug 2003 | Cesar Gil begins breeding chinchillas for the pet market. | Chinchilla Dreamin’ (textrip8), 20030901-1_36
6 | 13 Oct 2003 | Collie Carnes says that Luella Vedric is helping SPOMA track the Assan Circus in Africa, to stop illegal sales of wildlife. | 20031013_4
7 | 27 Oct 2003 | Multiple complaints about Global Ways tropical fish shipments – fish dead, shipping bags covered in some noxious substance. | 20031027_57
8 | 7 Nov 2003 | Cesar Gil rails on his blog against the mistreatment of chinchillas by fad pet owners, fur industry, pet shops, and South American trappers. | Chinchilla Dreamin’ (textrip5)
9 | 15 Dec 2003 | Letter writing campaign to stop the animal mistreatment and smuggling activities of Abu Hassan/Assan Circus. | 20031215-1_91
10 | 6 Jan 2004 | Fish & Wildlife Service issues advisory on catfish imports into Florida from South America; packaging bags contaminated. Names Global Ways a business of interest. | 20040105-1_58
11 | 20 Jan 2004 | r’Bear performs at SPOMA benefit hosted by Luella Vedric, donates $80,000. | 20040119-1_98
12 | 2 Mar 2004 | Animal Defenders International rescue team confiscates animals from Assan Circus in Zimbabwe. Abu Hassan flees the country. | 20040301-1_75
13 | 13 Mar 2004 | Madhi Kim visits r’Bear at Shravaana. | 20040308_109
14 | 15 Apr 2004 | Global Ways sponsors champagne/tropical fish auction; Luella Vedric and r’Bear are special guests of Madhi Kim. | 20040412-2_13
15 | 2 Apr – 30 Jun 2004 | Cesar Gil posts “Chinsurrection” cartoons on his blog, depicting chinchillas being infected, chinchillas multiplying, pet chinchilla making its owner sick. | Chinchilla Dreamin’ (20040402, 20040602, 20040603 jpg’s)
16 | Apr – Jun 2004 | r’Bear adds over 500 new animals to Shravaana, including Amur tigers, Northeast Congo lions, short-tailed chinchillas. | 20040614_94
17 | 1 Jul 2004 | r’Bear admitted to UC Medical Center with monkeypox symptoms. | 20040628_61
18 | 7 Jul 2004 | Monkeypox outbreak hits LA, pet chinchillas believed to be carriers. | 20040705_83
19 | 7 Jul 2004 | Cesar Gil posts final entry and “Chinsurrection, accomplished” cartoon on his blog. | Chinchilla Dreamin’ (textrip, 20040707.jpg)
20 (max) | 24 Jul 2004 | Two dead from monkeypox. Cesar Gil is sought in connection with the outbreak, believed to have fled the country. | 20040705_86


3. WHERE: What locations are most relevant to the plot(s)?

 

# | Location | Description | Most relevant source files (5 max)
1 | Los Angeles | AJL/Cesar Gil activities, monkeypox outbreak | 20030602-1_66, 20030714-2_25, 20040705_83, 20040705_86, Chinchilla Dreamin’
2 | Miami | Contaminated tropical fish imports | 20040105-1_58, 20040412-2_13, Tropical Fish Importers, DEA Files Updatev2
3 | South America | Source of chinchillas, tropical fish, cocaine | 20030630_40, 20031027_57, 20040105-1_58, 20040216-5_18, DEA Files Updatev2
4 | Africa | Assan Circus wildlife smuggling | ImportPermits, 20031013_4, 20031215-1_91, 20040301-1_75
5 (max) | San Diego | Shraavana exotic animal sanctuary | 20030609_7, 20040308_109

 


4. DEBRIEFING

Chinchilla-borne monkeypox

 

In June 2003, in response to PETA investigations into small animal abuses by the PetSmart chain, the Animal Justice League (AJL) broke into three PetSmart stores in the Los Angeles area and abducted “back room” animals, i.e., small animals kept from public view because of illness. Authorities sought Faron Gardner, AJL spokesperson, in connection with the break-ins. The AJL stated on its website that this would not be the last revenge attack of its kind.

 

True to its word, the AJL claimed in July in a letter to the Los Angeles Times that meat had been poisoned in 20 Los Angeles supermarkets. In a phone call to the Times, the AJL stated “We will take direct action against animal abuse in whatever form is necessary to stop the cruelty.” No poisoned meat was found in any supermarket.

 

In August 2003, biologist Cesar Gil, Faron Gardner’s friend and fellow animal rights activist, set up business as a chinchilla breeder. By September, he was selling chinchillas as pets at the West LA Farmer’s Market. This seemed an odd choice for a man who described himself as “pretty fanatical about animal rights” and who in his blog railed against the mistreatment of chinchillas by fad pet owners, the fur industry, pet shops, and South American poachers.

 

In the spring of 2004, Gil’s motives became clear. His blog’s cartoon series “Chinsurrection” depicted chinchillas being infected, infected chinchillas multiplying, and a pet chinchilla making its owner ill. His plan was to stop the chinchilla pet fad by scaring current and potential chinchilla owners with an outbreak of an infectious disease carried by chinchillas. The loss of the chinchilla pet market would put a crimp in South American poaching operations, allowing the endangered chinchillas to flourish once again. He was willing to sacrifice a few chinchillas for the greater good (“no price is too high for freedom”) and had no problem breaking the law. As he stated after the PetSmart break-ins, “If the harm to animals can be stopped, that outweighs the wrongs of breaking a law or two.”

 

Gil’s plan succeeded, at least to some extent. A monkeypox outbreak hit Los Angeles in July 2004. Pet chinchillas were believed to be the carriers. One of the monkeypox victims was megastar rapper r’Bear, who had acquired short-tailed chinchillas, presumably from Gil, in June. As of July 24, 2004, two people had died from monkeypox. Officials were seeking Gil and believed he had fled the country. Chinchilla owners were indeed frightened.

 

There are several unanswered questions in this scenario that require further investigation:

 

  • Who supplied Cesar Gil’s initial stock of chinchillas? Did Faron Gardner liberate chinchillas from the PetSmart stores and pass them on to Gil? Or was Rosalind Baptista, the Chilean chinchilla poacher, working for Gil? Did she help distribute the infected chinchillas? One of the “Chinsurrection” cartoons suggests that she did.
  • How did Cesar Gil obtain the monkeypox virus? Was it from the ailing PetSmart chinchillas? There’s a 1-year gap between the PetSmart break-ins and the monkeypox outbreak. Did Cesar Gil have the skills to keep the virus and the chinchillas alive for that period without getting the pox himself?

 

Global Ways’ involvement with the illegal wildlife trade

 

Global Ways is an international import/export business, specializing in the import of tropical fish from South America to the U.S. Its founder and CEO, Madhi Kim, is a former game warden from Kenya’s Masai Mara National Reserve. He also runs Wild Things, an animal hunting ranch in Texas which provides “exotic animals for big game hunters willing to pay thousands of dollars for the chance to bag a wildebeest or a scimitar-horned oryx.”

 

Between March 2003 and March 2004, five import permits were issued to Global Ways for wildlife consignments to Abu Hassan in Uganda, Kenya, Zambia, Tanzania, and Zimbabwe. The wildlife consignments included animals (such as tigers, lions, and elephants) covered by the Convention on International Trade in Endangered Species (CITES).

 

Abu Hassan, proprietor of the Assan Circus and the consignee named in the import permits, was suspected by animal rights organizations of engaging in animal smuggling, illegal sales, and abuse. In October 2003, Collie Carnes, spokesperson for the Society for the Prevention of Mistreatment of Animals (SPOMA), revealed that Luella Vedric, wealthy New York socialite, animal advocate, and SPOMA fund-raiser (despite her name being an anagram of “Cruella D’Evil”), was helping SPOMA track the activities of the Assan Circus.

 

In December 2003, soon after an import permit was issued to Global Ways consigning wildlife to Hassan in Tanzania, a letter-writing campaign was launched in Tanzania, urging CITES to stop Hassan’s animal mistreatment and smuggling activities. The form letter stated, in part: “Conditions at this circus are appalling. Animals are locked in cages too small for them. They trap also wild animals and sell them to overseas buyers. They smuggle chimps and parrots. THIS HAPPENS EVERYWHERE THEY PERFORM!!!”

 

The last Global Ways import permit was issued March 1, 2004, in Zimbabwe. The next day an Animal Defenders International rescue team secured a CITES confiscation order for 4 tigers, 6 lions, and a python from the Assan Circus. They also seized 10 dogs and 2 horses on welfare grounds. All of these animals were part of the import consignment. Abu Hassan had fled the country and could not be located.

 

Our theory is that Madhi Kim was using the services of Abu Hassan to stock Wild Things, his animal hunting ranch. He chose Hassan, in part, because (according to its website) CITES makes some exceptions to the general import/export principles for circuses, and perhaps would not monitor their activities as closely. The import certificates were fraudulent. Abu Hassan was not receiving these animals from Global Ways as the import data implies; he was the source of the animals. Hassan trapped (or otherwise illegally acquired) the animals as his circus travelled through sub-Saharan Africa. With the faked import certificates indicating he had obtained the animals legally, he was then able to re-export the animals to Madhi Kim via Global Ways. Export records should be examined. (Here’s a mystery: tigers, which are Asian, not African, were included in several of the import consignments. How were they obtained?)

 

Madhi Kim’s activities in the weeks following Abu Hassan’s disappearance also warrant further study. In mid-March, he met with officials from the U.S. Department of Agriculture. Is he or his ranch under investigation? About the same time, he also visited Shravaana, megastar rap artist r’Bear’s exotic animal sanctuary. Then there was Kim’s “Nights of Champagne and Tropical Fish” auction in Miami in April, with special guests Luella Vedric and r’Bear. Was that just business as usual, or was he liquidating his assets to raise money for his defense? In June, r’Bear announced that he had added over 500 new animals to Shravaana since April, including tigers and lions. Did Madhi Kim close down the ranch and sell his wildlife to r’Bear?

 

Global Ways’ involvement with cocaine smuggling

 

Global Ways’ tropical fish importing business handles about 5000 shipments per quarter. It advertises its healthy specimens, fast, reliable shipping from South America by air, and a lower risk of DOA (dead on arrival) than the competition’s. Global Ways’ clients include Luella Vedric and r’Bear. Luella has an enormous tropical fish habitat at her home, and Madhi Kim, Global Ways’ founder and CEO, personally checks up on how the fish are doing.

 

In September 2003 complaints arose about Global Ways tropical fish shipments. Customers complained of DOA rates as high as 90 percent, poor packaging, and poor responsiveness. One customer received a shipment of Tigrinus catfish in which “shipping bags were covered in some noxious substance that caused our handlers hands to go numb and eventually need emergency medical treatment.” Global Ways blamed “an inexperienced packer in South America for problems in a very few shipments.”

 

The following January, the Fish and Wildlife Service (FWS) issued an advisory on catfish imports from South America into Florida, warning tropical fish merchants not to handle any shipments. Some of the packaging bags were contaminated with a toxin that caused tingling of the hands, dilated eyes, breathing difficulty, and euphoria. The advisory named Global Ways and nine other tropical fish importers as “businesses of interest.”

 

The symptoms described in the FWS advisory are all effects of cocaine inhalation or skin contact. Clearly, the fish shipments were used to smuggle cocaine into the country.

 

DEA reports describe a number of novel methods of smuggling cocaine. One report describes cocaine-impregnated silicone in baseball cap fabric. The Peruvian chemist who prepared the material had fabricated other items using the cocaine-silicone mixture, including wetsuits and suitcase liners.

 

Fish are usually shipped in a styrofoam case with an insulating lining and a plastic bag filled with water, oxygen, and the fish. The lining could have been replaced with something similar to the cocaine-impregnated materials described in the DEA report.

 

Fish are tranquilized for shipment. The DEA found chloroform to be the best solvent for extracting the cocaine from the material. Is it possible that the fish tranquilizer played a part in breaking down the insulating lining and making the cocaine evident?

 

South American drug cartels are known to make use of wildlife shipments for transporting drugs. The “inexperienced packer in South America” placed the cocaine in a few shipments. When it became clear the method wasn’t working, it was stopped.

 

Was Madhi Kim part of the drug smuggling operation? It’s possible he was completely unaware that Global Ways was being used. It’s also possible that other tropical fish importers were used as well. And Mr. Kim seemed genuinely concerned about the fish, given his attention to Luella Vedric’s collection. On the other hand, he was involved in the whole illegal wildlife trade/animal hunting ranch scenario, which was pretty unsavory. Maybe he knew. Further investigation is needed.

 


5. VISUALS and Description of ANALYTICAL PROCESS

Using NdCore to Analyze Raw Text

 

First, NdCore’s Job Builder wizard was used to load and analyze the raw text files (*.txt, *.doc, and *.pdf from the News_Text, BlobText, Support, and Support\MSDS directories). The analysis process parses the text, builds concepts, and identifies related words, stems, and sound-alike words (good for catching misspellings). It took just a few minutes to process the nearly 1500 documents.
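The submission does not say how NdCore identifies sound-alike words. As a rough illustration only, the sketch below uses the classic American Soundex code, a well-known phonetic hash that is swapped in here as an assumption and is not NdCore's actual method. It shows how two spellings that occur in the data set, "Shravaana" and "Shraavana", can be matched despite the misspelling.

```python
def soundex(word: str) -> str:
    """Return the four-character American Soundex code for a word."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    letters_only = [ch for ch in word.lower() if ch.isalpha()]
    if not letters_only:
        return ""

    first = letters_only[0]
    result = first.upper()
    prev = codes.get(first, "")
    for ch in letters_only[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # a vowel resets prev, so a repeat after a vowel is kept
    return (result + "000")[:4]


# Both spellings of the sanctuary's name hash to the same code.
print(soundex("Shravaana"), soundex("Shraavana"))  # S615 S615
```

A sound-alike index built this way would group words by code, so a search on one spelling could also retrieve documents that use the other.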

 

Next, we used NdCore’s Concept Builder to start browsing through the data. The contest instructions pointed us toward “unexpected activities concerning wildlife law enforcement, endangered species issues, and ecoterrorism.” With that in mind, we started with a search for “endangered species”:

 

 

Figure 1. NdCore Concept Builder, initial search phrase entry

 

As shown in Figure 1, Concept Builder initially presents a list of the most common words in the data set. You can pick one or enter your own. We entered “endangered species”. The query returned 40 documents, which seemed a large number to read through, so we let Concept Builder suggest a third term to complete the concept and perhaps narrow the search:

 

 

Figure 2. NdCore Concept Builder, suggestions for next word after “endangered species”

 

Figure 2 shows Concept Builder’s suggestions for the next word. This is a list of the significant words occurring after “endangered species” throughout the document corpus. The number preceding each word indicates the number of documents in which the three-term concept occurs. We selected “CITES” because it had the most associated documents (5) and because it was unfamiliar to us. Clicking “Show Results” lists each of the associated documents with a brief fragment containing the concept searched for. We quickly learned that CITES is the Convention on International Trade in Endangered Species.
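The document counts shown beside each suggested word can be approximated with a simple follower count. The sketch below is a simplified stand-in, not Concept Builder's implementation: it only considers the word immediately after the phrase, whereas NdCore is described as also weighing usage and proximity. The function names and the toy corpus are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def suggest_next_words(documents, phrase, stopwords=frozenset()):
    """For each word that directly follows `phrase`, count the documents it occurs in."""
    phrase_tokens = tokenize(phrase)
    n = len(phrase_tokens)
    doc_counts = Counter()
    for text in documents.values():
        tokens = tokenize(text)
        followers = set()  # a set, so each document counts a word once
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == phrase_tokens and tokens[i + n] not in stopwords:
                followers.add(tokens[i + n])
        doc_counts.update(followers)
    return doc_counts.most_common()


# Toy corpus invented for illustration; not taken from the contest data.
docs = {
    "doc_a": "The endangered species CITES listing covers endangered species CITES permits.",
    "doc_b": "Endangered species trade is regulated; see the endangered species CITES rules.",
    "doc_c": "Tropical fish shipments arrived late.",
}
for word, count in suggest_next_words(docs, "endangered species"):
    print(count, word)
```

Counting documents rather than raw occurrences mirrors the number described above, which tells the analyst how many documents a three-term concept would return before running the search.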

 

 

Figure 3. NdCore Concept Builder, search results for “endangered species CITES”

 

From the search results page (Figure 3), you can view any document’s gist (summary), view its full text, find documents similar to it, or download it to your local drive. We read through the full text of the five documents starting from the top, noting the names of people, places, and organizations for future searches. (NdCore has no scratch pad or sandbox-like facilities, so we simply kept notes in an MSWord document.) Two of the documents made reference to the Assan Circus, which was apparently involved in illicit activities (wild animal smuggling) that tied in well with the contest subject matter.

 

 

Figure 4. NdCore, document full text view, referencing the Assan Circus

 

Figure 4 is an example of NdCore’s document viewer, in this case displaying the third document returned from the “endangered species CITES” search, which contains a letter to CITES protesting the animal smuggling activities of Abu Hassan and his Assan Circus. Opening its “Find Similar” tab led us to additional documents concerning the Assan Circus and wild animal smuggling:

 

 

Figure 5. NdCore, similar documents list

 

As shown in Figure 5, four similar documents were found. The page shows the discriminating terms that were used to determine the degree of similarity (you can use these to further filter the list of documents) and indicates which documents have already been viewed. Clicking one of the “Compare” arrows shows a side-by-side comparison of the two documents with the discriminating terms highlighted:

 

 

Figure 6. NdCore, similar documents comparison

 

Figure 6 shows the comparison of our original document with the first of its similar documents – yet another document concerning the Assan Circus. Thus, after reviewing just six documents, we had a solid lead in constructing a scenario surrounding the animal smuggling activities of Abu Hassan and the Assan Circus. That’s NdCore’s greatest strength – leading the analyst to documents of interest through search term suggestions and similar document searches. Review of the other similar documents led us to Collie Carnes, SPOMA, and Luella Vedric. Searches on those terms led to Faron Gardner, Animal Justice League, Mr. Kim, and r’Bear. That process continued until we had a pretty good idea of what was happening.
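The similar-document search with discriminating terms (Figures 5 and 6) can be sketched with TF-IDF weighting and cosine similarity. This is an assumed technique chosen for illustration; the submission does not state how NdCore scores similarity. The document names and texts are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Map each document name to a {term: tf * idf} weight dictionary."""
    term_counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    n_docs = len(documents)
    return {
        name: {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
        for name, counts in term_counts.items()
    }


def cosine(a, b):
    """Cosine similarity of two sparse weight dictionaries."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def find_similar(documents, query_name, top_terms=3):
    """Rank other documents by similarity and list the shared terms that contribute most."""
    vectors = tfidf_vectors(documents)
    query = vectors[query_name]
    results = []
    for name, vec in vectors.items():
        if name == query_name:
            continue
        shared = sorted((t for t in query if t in vec),
                        key=lambda t: query[t] * vec[t], reverse=True)
        results.append((name, cosine(query, vec), shared[:top_terms]))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Toy documents invented for illustration.
docs = {
    "letter": "circus smuggling animals circus",
    "report": "circus animals rescued",
    "fish": "tropical fish shipment delayed",
}
for name, score, terms in find_similar(docs, "letter"):
    print(name, round(score, 3), terms)
```

The shared terms with the largest weight products play the role of the discriminating terms: they explain why two documents were judged similar and could be used to filter the list further.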

 

We also used MSAccess to examine the import permit data (which linked Mr. Kim with Abu Hassan) and tropical fish importer data, and we took a look at the pictures.

 

Using REGGAE to Analyze Structured Data

 

While the NdCore analysis resulted in a full solution, REGGAE was also used independently to demonstrate its visualization and analytic capabilities.  REGGAE is geared primarily toward analysis of discrete, structured data but also includes its own text searching facilities, so both the VAST entity extraction data and the raw text were imported and processed.

 

To get started in REGGAE, we performed a single cell query on the entity "CESAR GIL". (We read his blog and assumed that he was involved in the plot in some way.)

 

 

Figure 7. REGGAE, single cell query

 

REGGAE found two documents and placed them on a chart. We expanded each of the document cells to see all associated structured data. After some experimentation, to limit the amount of data shown, we constrained the display to include DocumentName, StringValue (extracted entities), Article (used to access the document text), and Date (of the document).

 

 

Figure 8. REGGAE, Cesar Gil expansion

 

The chart in Figure 8 shows the documents surrounded by their extracted entities. In this view we see connections between Cesar Gil and, for example, David Chelmsworth and AJL. Some of the connections are more useful than others. “50 years ago” and “Mon Jun 06” are unlikely to yield additional information if expanded, so we concentrated on people and organizations when deciding which nodes to expand further.

 

We also used REGGAE's "Find Similar" feature to suggest new searches and to guide further expansion of the chart. For example, right-clicking the entity "AJL" and selecting "Find Similar" opened the form shown in Figure 9.

 

 

Figure 9. REGGAE, find similar results for AJL.  PETA and ELF are both linked to AJL.

 

We selected DocumentName as the type to use in the similarity search. The results show the StringValues (entities) similar to AJL in terms of links to the same DocumentNames, ranked by degree of similarity. Clicking on an entity in the Similar Results list shows the DocumentNames it has in common with AJL. The results gave us candidates for further searching. Also, as shown in Figure 9, we noticed that AJL is associated with at least two documents that were not yet on our chart. This led us to expand AJL to add those documents to the chart.
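Ranking entities by the documents they share with a target can be sketched with Jaccard overlap of document sets. The measure is an assumption made for illustration; REGGAE's actual similarity score is not described in this submission. The entity names come from Figure 9, but the document IDs and links are invented and are not taken from the contest data.

```python
def similar_entities(entity_docs, target):
    """Rank other entities by Jaccard overlap of their document sets with the target's."""
    target_docs = entity_docs[target]
    ranked = []
    for entity, docs in entity_docs.items():
        if entity == target:
            continue
        shared = target_docs & docs
        if shared:
            score = len(shared) / len(target_docs | docs)
            ranked.append((entity, score, sorted(shared)))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(ranked, key=lambda r: (-r[1], r[0]))


# Hypothetical entity-to-document links, for illustration only.
entity_docs = {
    "AJL": {"d1", "d2", "d3"},
    "PETA": {"d1", "d2"},
    "ELF": {"d3", "d4"},
    "Global Ways": {"d5"},
}
for entity, score, shared in similar_entities(entity_docs, "AJL"):
    print(entity, round(score, 2), shared)
```

Returning the shared documents alongside the score corresponds to the step described above, where clicking an entity in the results list shows the DocumentNames it has in common with AJL.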

 

 

Figure 10. REGGAE, AJL expansion

 

Figure 10 shows the results of expanding AJL and then expanding the added documents. The analysis continued in this fashion, with documents connecting to entities, and those entities in turn connecting to more documents. 
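The alternating expansion described here, from entities to documents and back to entities, amounts to a breadth-first walk over a bipartite document-entity graph. The sketch below is a minimal model of that idea under stated assumptions; it is not REGGAE's implementation, the function names are hypothetical, and the links between documents and entities are invented.

```python
from collections import defaultdict


def build_entity_index(doc_entities):
    """Invert {document: [entities]} into {entity: {documents}}."""
    entity_docs = defaultdict(set)
    for doc, entities in doc_entities.items():
        for entity in entities:
            entity_docs[entity].add(doc)
    return entity_docs


def expand(doc_entities, start_entity, rounds=1, skip=frozenset()):
    """Expand entity -> documents -> entities for a number of rounds.

    `skip` holds entities not worth expanding, such as date fragments.
    Returns the documents and entities that would be on the chart.
    """
    entity_docs = build_entity_index(doc_entities)
    seen_entities = {start_entity}
    seen_docs = set()
    frontier = {start_entity}
    for _ in range(rounds):
        new_docs = set()
        for entity in frontier:
            new_docs |= entity_docs.get(entity, set()) - seen_docs
        seen_docs |= new_docs
        new_entities = set()
        for doc in new_docs:
            new_entities |= set(doc_entities[doc]) - seen_entities - skip
        seen_entities |= new_entities
        frontier = new_entities
    return seen_docs, seen_entities


# Hypothetical extracted-entity data, for illustration only.
doc_entities = {
    "d1": ["CESAR GIL", "AJL", "50 years ago"],
    "d2": ["CESAR GIL", "David Chelmsworth"],
    "d3": ["AJL", "PETA"],
}
docs, entities = expand(doc_entities, "CESAR GIL", rounds=2, skip=frozenset({"50 years ago"}))
print(sorted(docs), sorted(entities))
```

Each round corresponds to one manual expansion step on the chart; the `skip` set models the analyst's choice to ignore low-value nodes and concentrate on people and organizations.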

 

While the chart is useful for finding related documents through their common connections, it’s essential to be able to read the documents easily. Fortunately, REGGAE lets you access a document’s text directly from the chart.

 

 

Figure 11. REGGAE, document text, Cesar Gil wanted in connection with the monkeypox outbreak

 

We read through the text of the documents on the chart, noting the names of people, places, organizations, and other interesting events and activities for further searching. Most of these were already on the chart, as they were extracted as entities in the preprocessed VAST data. As with NdCore, we used an MSWord document to keep notes.

 

In addition to structured data analysis, REGGAE includes text searching. One of the documents we examined (Figure 11) talked about a monkeypox outbreak. A single cell query for "monkeypox" produced no results, since "monkeypox" is not an extracted entity. So we did a keyword search for "monkeypox":

 

 

Figure 12. REGGAE, text query for “monkeypox”

 

The text query had the following results:

 

 

Figure 13. REGGAE, “monkeypox” text query results

 

The form shown in Figure 13 allowed us to read the document text and export the selected documents to a chart for analysis of their related structured data, as described above. We continued this process, iteratively examining connections among structured data elements, performing text queries, and reviewing documents, until we reached the solution.
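The difference between the two query types used above can be illustrated with a small model: an entity query only matches values in the extracted-entity table, while a text query consults an index built from the full document text. The sketch is an assumption-laden toy and does not reflect REGGAE's internals; all names and data are hypothetical.

```python
import re
from collections import defaultdict


def build_text_index(doc_texts):
    """Build a simple inverted index: token -> set of document names."""
    index = defaultdict(set)
    for name, text in doc_texts.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(name)
    return index


def entity_query(doc_entities, value):
    """Return documents whose extracted entities include `value` (case-insensitive)."""
    wanted = value.upper()
    return sorted(doc for doc, entities in doc_entities.items()
                  if wanted in {entity.upper() for entity in entities})


def text_query(index, keyword):
    """Return documents whose full text contains `keyword` as a token."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical documents and extracted entities, for illustration only.
doc_texts = {
    "d1": "Monkeypox outbreak hits the city; a breeder is sought.",
    "d2": "Tropical fish auction draws celebrity guests.",
}
doc_entities = {"d1": ["CESAR GIL"], "d2": ["MADHI KIM"]}

index = build_text_index(doc_texts)
print(entity_query(doc_entities, "monkeypox"))  # no match: not an extracted entity
print(text_query(index, "monkeypox"))           # found via the full-text index
```

The documents returned by the text query can then seed a chart expansion, which is the hand-off from text search to structured analysis described above.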

 
