Entry Name:  "KULEUVEN-Sakai-GC"

VAST 2015 Challenge
Grand Challenge

 

 

Team Members:

Ryo Sakai, KU Leuven, ryo.sakai@esat.kuleuven.be     PRIMARY
Daniel Alcaide, KU Leuven, daniel.alcaide@esat.kuleuven.be

Jan Aerts, KU Leuven, jan.aerts@esat.kuleuven.be

Student Team:  YES

 

Analytic Tools Used:

R (ggplot2, dplyr, igraph, tidyr, RColorBrewer, lubridate, vegan)

Processing, to prototype

 

Approximately how many hours were spent working on this submission in total?

120

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES

 

 

Video Download

Video:

ftp://ftp.esat.kuleuven.be/pub/stadius/rsakai/VAST_2015/kuleuven-sakai-gc-video.mp4  (32MB)

ftp://ftp.esat.kuleuven.be/pub/stadius/rsakai/VAST_2015/kuleuven-sakai-gc-video.wmv (142MB)

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

For each of the following questions, consider both the movement and communications data.

GC.1 – Scott is not a paying customer and does not have an ID. Describe Scott Jones’ activities in the park during the three-day weekend. Who does he spend most of his time with? When does he arrive? When does he leave? What route does he follow?

Limit your response to no more than 10 images and 1000 words.

Based on the information that Jones was scheduled to appear in two stage shows, we subset the individuals who appear at the Grinosaurus Stage and visualize their presence in the Sequence View (Figure 1.1). Each individual is represented as a horizontal line, and the length of the line shows the inferred duration of time at this location. The color encodes whether the visit involved a check-in. In Figure 1.1, we find 8 individuals who come back regularly and stay without a check-in, referred to as “subset1”. We suspect that subset1 is park security staff and that Jones could be with these individuals (hypothesis1). Another hypothesis (hypothesis2) is that Jones could be with those who check in twice or stay longer than normal. For example, we find 4 individuals who stay for 6.2 hours on Friday (subset2), and 3 individuals who stay longer after the afternoon show on Saturday (subset3). Because subset2 and subset3 stay at the stage, probably with Jones, they could be park patrons with some special privileges.

Figure 1.1. Sequence view of movements at the Grinosaurus Stage.
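
The duration inference behind the Sequence View can be sketched as follows. This is a minimal illustration in Python, not the R code used for the entry. It assumes each record of one visitor is a tuple `(timestamp, type, x, y)` with `type` being `"check-in"` or `"movement"`, and that a stay ends when the visitor is next recorded at a different coordinate. The function name `infer_stays` and the coordinates in the usage note are hypothetical.

```python
from datetime import datetime

def infer_stays(records, location):
    """Group one visitor's records at `location` into stays.

    records: iterable of (timestamp, type, x, y) tuples (assumed layout).
    Returns a list of (start, end, checked_in) tuples, where `end` is the
    timestamp of the first record seen elsewhere after the stay began.
    """
    stays = []
    current = None  # [start, end, checked_in] of the stay in progress
    for ts, kind, x, y in sorted(records):
        if (x, y) == location:
            if current is None:
                current = [ts, ts, False]
            current[1] = ts
            current[2] = current[2] or kind == "check-in"
        elif current is not None:
            current[1] = ts  # the stay ends when the visitor shows up elsewhere
            stays.append(tuple(current))
            current = None
    if current is not None:  # the visitor was never seen leaving
        stays.append(tuple(current))
    return stays
```

Calling `infer_stays(records, (10, 10))` for each visitor, with `(10, 10)` standing in for the stage coordinates, yields one horizontal segment per stay, and the `checked_in` flag corresponds to the color encoding described above.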

The movement pattern of subset1 (521750, 644885, 1080969, 1600469, 1629516, 1781070, 1787551, 1935406) appears to be very consistent and synchronized (Figure 1.2). They go back and forth between the East Entrance and the Grinosaurus Stage.

Figure 1.2. Movement pattern of subset1. The arrow indicates the direction of movement, and a blue circle represents a check-in.

As mentioned in our MC1 entry, when we compare the trajectory paths of subset1, we find an anomaly in the movement of 1080969 on Sunday morning (Figure 1.3). Otherwise, subset1 appears to move very regularly between the East Entrance and the Grinosaurus Stage.

Figure 1.3. Trajectory paths of subset1.

Figure 1.4 shows the GPS trails of subset2 and subset3. Colored circles indicate check-in records, and the color encodes the time of day. Extending hypothesis2, Jones could be with subset2 on Friday, but on Saturday and Sunday only 1690685 visits the park, and he/she does not go to the stage on those days. Jones could be with subset3 on Saturday, but subset3 only visits for the afternoon show.

Another hypothesis (hypothesis3), which combines insights from hypothesis1 and hypothesis2, is that Jones is generally with subset1; on Friday, however, he stays at the stage with subset2 between the two scheduled shows. Jones then leaves with subset1 after the second show via the East Entrance. On Saturday, Jones comes to the stage with subset1, and after the afternoon show he stays with subset3 and leaves the park with them via the East Entrance. If Jones is with subset1, it is not clear why they would not take the shortest route from the East Entrance to the Grinosaurus Stage.

Figure 1.4. GPS trails and check-in patterns of subset2 and subset3.

 

 

 

GC.2 – Identify up to 8 issues with park operations during the three-day weekend.  Provide a rationale for your answers.

Limit your response to no more than 8 images and 800 words.

 

One issue is the waiting time for some attractions. From the GPS data, we infer how much time a visitor spends after a check-in or after arrival, and overlay histograms to compare the distribution of time spent at each attraction (Figure 2.1). Some Thrill Rides, including Firefall, Flight of the Swingodon, TerrorSaur, and Wendisaurus Chase, show a shift in distribution. For these attractions, the waiting time on Saturday and Sunday is much longer than on Friday.

Figure 2.1.  Histograms of time spent at each attraction over three days.

We compare the distribution of waiting time by hour of the day using violin plots (Figure 2.2). The width of each violin is scaled to the maximum count per bin. The plot shows that the waiting time for these rides is about 60 minutes on Saturday and Sunday between 11:00 and 19:00, while it is only around 10 minutes on Friday, except for the TerrorSaur, whose waiting time appears to peak around 15:00.

Figure 2.2. Violin plots of waiting time for selected thrill rides.
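
A per-hour comparison of this kind can be sketched as below. This Python snippet is only an illustration, not the R pipeline behind the figures. It assumes the stays at one attraction are already available as `(start, end)` datetime pairs, and it summarizes each day and hour by the median waiting time in minutes instead of drawing a full violin. The function name `median_wait_by_hour` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def median_wait_by_hour(stays):
    """Summarize waiting time per calendar day and hour of arrival.

    stays: iterable of (start, end) datetime pairs (assumed layout).
    Returns {(date, hour): median waiting time in minutes}.
    """
    buckets = defaultdict(list)
    for start, end in stays:
        minutes = (end - start).total_seconds() / 60
        buckets[(start.date(), start.hour)].append(minutes)
    return {key: median(values) for key, values in buckets.items()}
```

Comparing the values for the same hour across the three dates gives a quick numeric check of the shift that the violin plots show visually.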

 

We compare the distributions of the length of time each visitor spends at the entrance, and their frequency, by drawing histograms with stacked bars (Figure 2.3). They show that visitors who arrive closer to 9:00 on Saturday and Sunday have to wait longer at the North Entrance. The operation at the entrance could be improved by anticipating a large volume of arrivals just before 9:00.

Figure 2.3. Histograms of visitor counts per entrance over time.  The color of bars represents the waiting time.

 

Perhaps one of the critical issues with the park operation was that the Pavilion was not completely cleared between shows on Sunday. Figure 2.4 is a sequence view of all the visitors who visit the Creighton Pavilion; the color encodes whether the visit was associated with a check-in. As the arrows indicate, there is a group of 37 visitors who appear at the location without a check-in (group 1) and a group of 3 visitors who appear to have stayed after the morning show (group 2).

Figure 2.4. Sequence view at the Pavilion on Sunday.

 

Although it is perhaps a minor operational issue with the GPS tracking or recording system, the GPS trails of 1983765 show very odd patterns at 15:00 and 20:00 on Saturday, where it looks as if the GPS records represent two separate individuals (Figure 2.5). In the plot, the GPS records are ordered by timestamp, and lines are drawn connecting the locations of each pair of consecutive records.

 

Figure 2.5. GPS trails of 1983765 on Saturday.
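
One way to flag such a pattern without plotting is to order the records by timestamp and look for consecutive records that are far apart on the grid. The Python sketch below is an illustration under stated assumptions, not the method used for the figure: records are `(timestamp, x, y)` tuples for a single ID, and a normal step is assumed to move at most one grid cell in each direction (`max_step=1`). Both the function name `find_jumps` and the threshold are hypothetical.

```python
def find_jumps(records, max_step=1):
    """Return pairs of consecutive records that are more than
    `max_step` grid cells apart along either axis.

    records: iterable of (timestamp, x, y) tuples for one ID (assumed layout).
    """
    ordered = sorted(records)  # order by timestamp first
    jumps = []
    for prev, curr in zip(ordered, ordered[1:]):
        _, x0, y0 = prev
        _, x1, y1 = curr
        if max(abs(x1 - x0), abs(y1 - y0)) > max_step:
            jumps.append((prev, curr))
    return jumps
```

A trail that alternates between two distant positions would produce many flagged pairs in a short time span, which is consistent with two devices reporting under one ID, although other explanations such as recording errors cannot be ruled out from this check alone.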

 

 

 

 

GC.3 – For the crime, describe the following, and provide your rationale:

a.      When did the crime occur?

b.     Where did the crime take place?

c.      Who are the most likely suspects in the crime?

Limit your response to no more than 5 images and 500 words.

 

We suspect that the crime occurred between 10:00 and 11:30 on Sunday at the Creighton Pavilion for a few reasons.

·       The news brief mentions the vandalism at the pavilion.

·       The partial closure of the park is observed after 12:00 in the check-in count.

·       The communication count peaks in the Wetland around 12:00 on Sunday, as shown in the histogram of communication counts, whose stacked bars highlight selected senders or receivers (Figure 3.1). 1278894, 839736, and external account for high volumes of communication, and subgraph 1 and subgraph 2 are sets of individuals derived from studying the communication network of those who contact external between 11:45 and 12:00 (further details are in our MC2 entry). In fact, subgraph 1 corresponds to group 1 in Figure 2.4.

We hypothesize that group 1 arrives at 10:00 and vandalizes the pavilion, and that they send messages among themselves around 11:30 as the pavilion opens for the midday show. The peaks of communication to external between 11:45 and 12:00 are followed by peaks of communication to 839736. In our MC2 entry, we saw that this large volume of communication originates from the vicinity of the pavilion.

 

Figure 3.1. Histogram of communication counts from the Wetland on Sunday.
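
The counts behind a stacked histogram of this kind can be sketched as follows. This Python snippet is an illustration only. It assumes the messages have already been filtered to one area and are given as `(timestamp, sender, receiver)` tuples with string IDs, that a message is labeled by a highlighted receiver first and a highlighted sender second, and that a 5-minute bin width is acceptable. The function name `bin_counts`, the labeling rule, and the bin width are all assumptions, not details taken from the entry.

```python
from collections import Counter
from datetime import datetime

def bin_counts(messages, highlight, minutes=5):
    """Count messages per time bin and highlight label.

    messages: iterable of (timestamp, sender, receiver) tuples (assumed layout).
    highlight: set of IDs to show as separate stacks; the rest become "other".
    Returns a Counter keyed by (bin_start, label).
    """
    counts = Counter()
    for ts, sender, receiver in messages:
        # Floor the timestamp to the start of its bin.
        bin_start = ts.replace(minute=ts.minute - ts.minute % minutes,
                               second=0, microsecond=0)
        if receiver in highlight:
            label = receiver
        elif sender in highlight:
            label = sender
        else:
            label = "other"
        counts[(bin_start, label)] += 1
    return counts
```

For example, `bin_counts(messages, {"external", "839736", "1278894"})` gives one count per bin and stack, which can then be drawn as stacked bars.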

 

Besides the suspicious group 1 and group 2 pointed out in Figure 2.4, we find the movement of 1080969 on Sunday odd, as shown in Figure 1.3. Upon closer inspection of his/her GPS trail, binned into 15-minute intervals (Figure 3.2), we find no GPS records between 2014-06-08 09:15:17 at position (18, 39) and 2014-06-08 09:19:40 at position (28, 16). This gap of about 4 minutes near the Pavilion (highlighted in red in Figure 3.2) is suspicious. It precedes the suspected time of the crime, and his/her involvement should be investigated further.

 

Figure 3.2. GPS trails of 1080969 between 8:45 and 12:20 on Sunday.
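
A gap such as the one between 09:15:17 and 09:19:40 can also be found programmatically. The Python sketch below is a minimal illustration that assumes the timestamps of one ID are available as `datetime` objects and that any silence longer than a chosen threshold counts as a gap. The function name `find_gaps` and the 2-minute default threshold are hypothetical choices; a visitor who is standing still at an attraction would also produce gaps, so results need to be checked against location.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, threshold=timedelta(minutes=2)):
    """Return (before, after, length) for each pair of consecutive
    timestamps that are more than `threshold` apart."""
    ordered = sorted(timestamps)
    return [(a, b, b - a)
            for a, b in zip(ordered, ordered[1:])
            if b - a > threshold]

# The two timestamps quoted in the text are 4 minutes 23 seconds apart.
gap = find_gaps([datetime(2014, 6, 8, 9, 15, 17),
                 datetime(2014, 6, 8, 9, 19, 40)])
```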

 

Additionally, examining Figure 2.4 closely, we identify individuals who appear to be in or at the Pavilion repeatedly without a check-in, some of them during the suspected crime (Figure 3.3). The roles of these individuals are unclear, but different IDs with similar visiting patterns are also observed on Friday and Saturday. One hypothesis is that they are workers at the park. Among these individuals, we find 159893, 430595, and 1711922 more suspicious since they appear to be at these coordinates during the suspected time of the crime (colored in a darker shade of gray).

 

 

Figure 3.3. Sequence view of those who come to the pavilion repeatedly on Sunday.  The Y position is comparable to Figure 2.4.