VAST 2008 Challenge
Mini Challenge 3: Cell Phone Calls
Authors and Affiliations:
Jason Dalton, SPADAC, jason.dalton@spadac.com
Chris Elsaesser, SPADAC
Steve Touw, SPADAC
Student team: NO
Tool(s): ORA and Allegro Graph were used to assess the social networks
inherent in the cell phone data. ORA, provided by the CASOS project at
Carnegie Mellon University, is a social network analysis system; we used it to
compute eigenvector centrality. Allegro Graph is a semantic web graph database
management and analysis system produced by Franz, Inc.; we used it to produce
social network visualizations as well as temporal and geospatial assessments.
Two-Page Summary: NO
ANSWERS:
Phone-1: What is the Catalano/Vidro social network, as reflected in the
cell phone call data, at the end of the time period?
Phone-2: Characterize the changes in the Catalano/Vidro social
structure over the ten day period.
Detailed Answer:
We gained a good understanding of Ferdinando Catalano’s (FC) social network by
applying a variety of network analytics. This report outlines our
problem-solving process. Catalano’s social network at the end of the 10-day
period is presented at the conclusion of the report.
Initial Social Network Identification
Beginning
with the intelligence provided that Person-200 was likely to be FC, we
generated a link chart of Person-200’s social network based on calls he made
and received. That network, presented as Figure 1, is quite small, which is
consistent with observations of leadership cells in grass-roots insurgencies.
Figure 1: Social Network of
Person-200 (suspected Ferdinando Catalano)
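The link chart described above can be approximated with a few lines of Python. This is a minimal sketch, not the Allegro Graph workflow we used: the record layout (caller, callee) and the sample values are assumptions made for illustration, and `one_hop_network` is a hypothetical helper name.

```python
# Hypothetical call records as (caller_id, callee_id); values are illustrative only.
calls = [(200, 1), (200, 5), (5, 200), (2, 200), (3, 200), (1, 2), (7, 8)]

def one_hop_network(calls, ego):
    """Return the ego's direct contacts and the undirected edges among them."""
    neighbors = set()
    for caller, callee in calls:
        if caller == ego:
            neighbors.add(callee)
        elif callee == ego:
            neighbors.add(caller)
    members = neighbors | {ego}
    # Keep only calls whose two endpoints both belong to the one-hop network.
    edges = {
        frozenset((caller, callee))
        for caller, callee in calls
        if caller in members and callee in members and caller != callee
    }
    return neighbors, edges

neighbors, edges = one_hop_network(calls, 200)
```

Edges are stored as unordered pairs so that a call in either direction yields a single link, which matches how a link chart is usually drawn.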
We next
computed call volume between Person-200 and the others in this network, as well
as the eigenvector centrality (aka “authority”) and connects-group ranks for
each of these persons. That information is presented in Figure 2. Combining
this information with the intelligence provided, we suspect the following
identities:
· Person-1 ranks first in Eigenvector Centrality and fourth in Connects
Groups. Consequently, we believe Person-1 to be David Vidro, who coordinates
high-level Paraiso activities.
· Person-2 and Person-3 are likely Juan Vidro and Jorge Vidro, but we are
uncertain which is which.
· Person-5 is likely to be Estaban Catalano (EC), based on the intelligence
that FC most frequently calls his brother.
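For readers unfamiliar with the eigenvector centrality ranking used above, the following is a minimal sketch of how such a score can be computed. It is not ORA's implementation: it uses plain power iteration on the shifted matrix A + I (an assumption chosen so the iteration also converges on bipartite graphs), and the edge list is illustrative rather than taken from the challenge data.

```python
import math
from collections import defaultdict

def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration on (A + I) for an undirected, weighted graph.

    edges maps (u, v) pairs to a weight such as a call count.
    """
    adj = defaultdict(dict)
    for (u, v), weight in edges.items():
        adj[u][v] = adj[u].get(v, 0) + weight
        adj[v][u] = adj[v].get(u, 0) + weight
    nodes = sorted(adj)
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Multiply by (A + I): each node keeps its score and adds its neighbors'.
        nxt = {n: x[n] + sum(w * x[m] for m, w in adj[n].items()) for n in nodes}
        norm = math.sqrt(sum(v * v for v in nxt.values()))
        nxt = {n: v / norm for n, v in nxt.items()}
        converged = max(abs(nxt[n] - x[n]) for n in nodes) < tol
        x = nxt
        if converged:
            break
    return x

# Illustrative graph: handset 1 talks to everyone; 200 and 5 also talk to each other.
edges = {(1, 2): 1, (1, 3): 1, (1, 5): 1, (1, 200): 1, (200, 5): 1}
scores = eigenvector_centrality(edges)
ranking = sorted(scores, key=scores.get, reverse=True)
```

A handset scores highly when it is connected to other high-scoring handsets, which is why the measure is sometimes read as "authority".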
Network Change Detection
The main question for this challenge problem is to characterize FC’s social
network at the end of the 10-day time period. Notice in Figure 2 that the key
individuals neither made calls to nor received calls from Person-200 after
6/7/2006. Yet handset 200 remained active, albeit at a much lower frequency,
until 8:18 p.m. on 6/9/2006, and the other key handsets were in use on all 10
days. One explanation for this change in communication patterns is that the
organization’s leadership started transitioning to different handsets on or
about 6/8/2006.
Figure 2: Call frequency with
Person-200 and his network
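The observation that key individuals stopped exchanging calls with Person-200 after a given date can be checked mechanically. The sketch below is one way to do so, assuming call records of the form (date, caller, callee); the records shown are invented for illustration and `last_contact_with` is a hypothetical helper.

```python
from datetime import date

# Hypothetical records as (call_date, caller_id, callee_id); illustrative only.
calls = [
    (date(2006, 6, 5), 200, 1),
    (date(2006, 6, 7), 5, 200),
    (date(2006, 6, 7), 200, 2),
    (date(2006, 6, 9), 200, 77),
    (date(2006, 6, 9), 1, 12),
]

def last_contact_with(calls, ego):
    """Map each contact of `ego` to the latest date on which they exchanged a call."""
    last = {}
    for day, caller, callee in calls:
        if ego in (caller, callee):
            other = callee if caller == ego else caller
            if other not in last or day > last[other]:
                last[other] = day
    return last

last_seen = last_contact_with(calls, 200)
```

Contacts whose last date falls well before the handset itself goes quiet are candidates for having moved to a different phone.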
Our method for detecting handset changes is first to plot the eigenvector
centrality of each of the key handsets. That plot, shown in Figure 3,
indicates a simultaneous drop in Eigenvector Centrality on 6/8/2006. This led
us to conclude that all five individuals likely acquired new phones on the
evening of 6/7/2006 or in the early morning of 6/8/2006. Their new identifiers
would likely be among those that had an Eigenvector Centrality spike on
6/8/2006. We identified these by finding which identifiers had a high standard
deviation in their Eigenvector Centrality scores across the 10 days. The
original identifiers (1, 2, 3, 5 and 200) had a high standard deviation, as
did five other handset identifiers: 300, 306, 309, 360 and 397. Figure 4 shows
spikes in Eigenvector Centrality for these handsets beginning 6/8/2006.
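The standard-deviation screen described above can be sketched as follows. The per-day scores, the threshold of 0.2, and the split index are all assumptions chosen for illustration; the report does not state the cutoff we applied. Handset 42 is a made-up control with a flat profile.

```python
from statistics import mean, pstdev

# Hypothetical per-day eigenvector centrality scores over 10 days, keyed by handset id.
daily_centrality = {
    1:   [0.9, 0.9, 0.8, 0.9, 0.9, 0.9, 0.9, 0.1, 0.0, 0.0],
    309: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 0.9, 0.8],
    42:  [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
}

def flag_handset_changes(series, threshold, split):
    """Flag handsets with high day-to-day variation and label the direction.

    `split` is the index of the first day of the later period (assumed here
    to be index 7, the eighth day of the series).
    """
    flagged = {}
    for handset, scores in series.items():
        if pstdev(scores) > threshold:
            before, after = mean(scores[:split]), mean(scores[split:])
            flagged[handset] = "spike" if after > before else "drop"
    return flagged

flags = flag_handset_changes(daily_centrality, threshold=0.2, split=7)
```

Handsets labeled "drop" correspond to phones being retired, and those labeled "spike" to candidate replacements.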
To determine which new identifier corresponds to which old one, we used two
techniques. First, we compared Eigenvector Centrality. This was especially
useful in associating Person-1 with Person-309, since both have the top
Eigenvector Centrality scores. Second, we used spatial analysis to compare the
cell towers at which each identifier spent most of its time and to find
similarities between the old and new identifiers. We compared per-day averages
because the old and new identifiers were not active over the same number of
days in the data. Comparing the per-day averages made it easy to associate the
old and new identifiers with one another.
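The per-day tower comparison can be illustrated with a short sketch. The record layout (handset, day, tower), the tower names, and the use of Manhattan distance as the similarity measure are assumptions; the report does not specify which measure was used.

```python
from collections import Counter

# Hypothetical records as (handset_id, day, tower_id); illustrative only.
records = []
for day in range(1, 8):       # old handset active on days 1-7
    records += [(1, day, "A"), (1, day, "A"), (1, day, "B")]
for day in range(8, 11):      # candidate new handsets active on days 8-10
    records += [(309, day, "A"), (309, day, "A"), (309, day, "B")]
    records += [(360, day, "C"), (360, day, "C"), (360, day, "C")]

def per_day_tower_profile(records, handset):
    """Average number of calls per active day at each tower for one handset."""
    days = {d for h, d, _ in records if h == handset}
    counts = Counter(t for h, _, t in records if h == handset)
    return {tower: n / len(days) for tower, n in counts.items()}

def profile_distance(p, q):
    """Manhattan distance between two tower profiles."""
    return sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in set(p) | set(q))

def best_match(records, old, candidates):
    """Pick the candidate whose per-day tower profile is closest to the old handset's."""
    old_profile = per_day_tower_profile(records, old)
    return min(
        candidates,
        key=lambda c: profile_distance(old_profile, per_day_tower_profile(records, c)),
    )

match = best_match(records, 1, [309, 360])
```

Dividing by the number of active days is what makes a handset seen on seven days comparable with one seen on only three; raw counts would differ even for identical daily behavior.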
Figure 3: Eigenvector Centrality
by day for initially identified handsets.
Figure 4: Eigenvector centrality
by day for handsets subsequently used by leadership.
Using this information, we plotted the one-hop social network of Person-300,
who we think is FC, previously identified as Person-200. That network, which
is the social network of FC at the end of the ten-day period, is depicted in
Figure 5. Notice the similarity to the network in Figure 1: with the exception
of a few unidentified persons, the networks are nearly identical in structure.
Figure 5: Catalano social network
at end of the period.
Summary and Conclusions
Using social network analysis, visual analytics, and geospatial correlation,
we conclude with moderate confidence that the Catalano network is as depicted
in Figure 5. There is, of course, some uncertainty due to the initial
intelligence and anomalies in the data.