Entry Name:  "ITBA-Fontanella-MC2"

VAST Challenge 2015
Mini-Challenge 2



Team Members:


Fernando Bejarano González, Instituto Tecnológico de Buenos Aires, fbejaran@itba.edu.ar
Teresa Natalia Fontanella De Santis, Instituto Tecnológico de Buenos Aires, tfontane@itba.edu.ar     PRIMARY


Student Team:  YES

 

Did you use data from both mini-challenges?  YES

 

Analytic Tools Used:

Tableau (most of the graphics)

PostgreSQL (database queries)

Gephi 0.8.2-beta (for "Net graph")

 

Approximately how many hours were spent working on this submission in total?

45 hours.

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES

 

 

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 Identify those IDs that stand out for their large volumes of communication.  For each of these IDs

 

      a.      Characterize the communication patterns you see.

      b.      Based on these patterns, what do you hypothesize about these IDs?

 

Limit your response to no more than 4 images and 300 words.

 

 

Part A

The following graph shows the total number of messages sent and received by the 20 highest-volume IDs, and where they were sent from. The IDs with the highest message volume (sending or receiving) are 1278894 and 839736.

It is worth noting that, leaving these two IDs aside, much of the message flow occurs in "Wet Land" (where the entrance to "Creighton Pavilion" is), and that the second largest recipient of messages is the group of external IDs. In addition, the two IDs do not appear in the position data and do not communicate with each other.
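A volume ranking like the one described above could be computed along the following lines. This is only a minimal sketch, not the queries actually used for the submission: the record layout (sender, receiver, location) and the sample values are assumptions made for illustration.

```python
from collections import Counter

def top_volume_ids(messages, n=20):
    """Return the n IDs with the most messages sent plus received.

    messages: iterable of (sender, receiver, location) tuples.
    This layout is an assumption, not the challenge's documented schema.
    """
    volume = Counter()
    for sender, receiver, _location in messages:
        volume[sender] += 1    # one message sent
        volume[receiver] += 1  # one message received
    return volume.most_common(n)

# Made-up records, for illustration only.
sample = [
    ("A", "B", "Entry Corridor"),
    ("A", "C", "Entry Corridor"),
    ("B", "A", "Wet Land"),
    ("C", "external", "Wet Land"),
]

for id_, count in top_volume_ids(sample, n=3):
    print(id_, count)
```

Counting the receiver as well as the sender means that a collective recipient such as the external IDs also shows up in the ranking.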

ID 1278894 Analysis

The following graph shows the number of messages sent by this ID during the three-day show. It sends a message every five minutes from 12:00 (noon) until 20:55 inclusive, except during the odd hours (13:00, 15:00, etc.). The sending pattern is very similar across the three days, except on Sunday, when the number of messages sent between 15:00 and 16:00 does not decrease. The number of messages also increases noticeably from Friday to Sunday.
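One way to check for a five-minute sending rhythm is to floor each send time to the start of its five-minute bin and count the messages per bin. The sketch below assumes the send times of a single ID are already available as datetime objects; the dates and times in the sample are made up.

```python
from collections import Counter
from datetime import datetime

def five_minute_bins(timestamps):
    """Count messages per five-minute bin (bin start -> count)."""
    counts = Counter()
    for ts in timestamps:
        # Floor the timestamp to the start of its five-minute bin.
        bin_start = ts.replace(minute=ts.minute - ts.minute % 5,
                               second=0, microsecond=0)
        counts[bin_start] += 1
    return counts

# Made-up send times for one ID, for illustration only.
sent = [
    datetime(2015, 1, 1, 12, 0, 10),
    datetime(2015, 1, 1, 12, 3, 0),
    datetime(2015, 1, 1, 12, 5, 0),
]

for bin_start, count in sorted(five_minute_bins(sent).items()):
    print(bin_start.strftime("%H:%M"), count)
```

A sender that emits on a strict schedule would show non-empty bins at regular steps and empty bins during the silent hours.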

ID 839736 Analysis

The following graph shows the number of messages sent by this ID during the three-day show. This ID sends messages steadily (and in more than one case, several at a time). As with 1278894, Sunday traffic is higher, but there is a very high (and suspicious) peak between 11:00 and 12:00. Many of the messages it sends and receives involve IDs 38945, 951112, 159893 and 18452. These IDs write to ID 839736, which responds to each message.


The following chart shows, for each of the two IDs analyzed, the users it exchanged messages with, colored red if on average the ID mostly responds to that user, or green if it mostly sends. The intensity of the color indicates the response time. It follows that ID 839736 usually responds to messages, whereas ID 1278894 usually sends them (and receives responses).
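The sender/responder distinction above can be approximated with a simple heuristic: within the conversation between the analyzed ID and one partner, a message from the analyzed ID counts as a response if the immediately preceding message came from the partner, and as an initiated message otherwise. This is a sketch of that heuristic under assumed inputs, not necessarily the rule behind the chart; the function name and the sample data are hypothetical.

```python
from datetime import datetime

def response_profile(messages, focus_id):
    """Summarize how focus_id interacts with each partner.

    messages: list of (timestamp, sender, receiver), in any order.
    Returns {partner: (initiated, responded, mean_response_seconds)},
    where the mean is None if focus_id never responded to that partner.
    """
    threads = {}
    for ts, sender, receiver in sorted(messages):
        if sender == focus_id:
            partner = receiver
        elif receiver == focus_id:
            partner = sender
        else:
            continue  # message does not involve the analyzed ID
        threads.setdefault(partner, []).append((ts, sender))

    profile = {}
    for partner, thread in threads.items():
        initiated = responded = 0
        delays = []
        previous = None  # (timestamp, sender) of the preceding message
        for ts, sender in thread:
            if sender == focus_id:
                if previous is not None and previous[1] == partner:
                    responded += 1
                    delays.append((ts - previous[0]).total_seconds())
                else:
                    initiated += 1
            previous = (ts, sender)
        mean_delay = sum(delays) / len(delays) if delays else None
        profile[partner] = (initiated, responded, mean_delay)
    return profile

# Made-up conversation, for illustration only.
sample = [
    (datetime(2015, 1, 1, 10, 0, 0), "P", "X"),
    (datetime(2015, 1, 1, 10, 0, 30), "X", "P"),
    (datetime(2015, 1, 1, 11, 0, 0), "X", "Q"),
    (datetime(2015, 1, 1, 11, 1, 0), "Q", "X"),
]
print(response_profile(sample, "X"))
```

The counts of initiated and responded messages would drive the red/green choice, and the mean delay the color intensity.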

Part B

From the above, ID 839736 is possibly an automatic park control system located in "Entry Corridor" that receives information from various sensors and confirms to them that the data was received correctly. ID 1278894, on the other hand, is possibly an automatic system located in "Entry Corridor" that periodically sends notifications to park visitors.

 

 

MC2.2 Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.

 

Limit your response to no more than 10 images and 1000 words.

 

 

 

Pattern 1



For the following analysis, messages sent from "Entry Corridor" are excluded, because a large number of messages are sent from there by automatic systems (as detailed in MC2.1). On all three days, the volume of messages increases between 8:00 and 12:00. It then remains practically stable until approximately 19:00, when it begins to decrease until the park closes (approximately 23:15). This indicates that park attendance is highest between 12:00 and 18:00.

Pattern 2

According to the data provided about the soccer player's event, he makes two daily appearances at "Creighton Pavilion", in the "Coaster Alley" section of the park. Analyzing the daily peaks of messages sent across the park and the peaks of messages sent from "Coaster Alley", we notice these patterns:

*On all three days there is a maximum at 11:00 sharp

*On Friday and Saturday there is a maximum at 16:00 sharp

*On Sunday there is no peak at 16:00, unlike the previous days

This indicates that the two events at which Scott appears likely take place at 11:00 and 16:00. It could also indicate that the 16:00 event was not held on Sunday.

Pattern 3

The following graph shows the total number of messages sent to IDs external to the park during the three days, grouped by time of day. There is an extraordinary increase in the number of these messages on Sunday around 11:00.




The next chart shows that the abnormal increase occurs precisely between 11:45 and 12:00.
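To narrow a spike down to a quarter of an hour, the messages to external IDs can be counted per day and per 15-minute window, so that the same window can be compared across days. The sketch below assumes that external recipients share a single label, here "external", and that records are (timestamp, sender, receiver) tuples; both are assumptions, and the sample data is made up.

```python
from collections import Counter
from datetime import datetime

EXTERNAL = "external"  # assumed label for recipients outside the park

def external_counts(messages, minutes=15):
    """Count messages to external IDs per (date, window start time).

    minutes should divide 60 so that windows align with the hour.
    """
    counts = Counter()
    for ts, _sender, receiver in messages:
        if receiver != EXTERNAL:
            continue
        # Floor the timestamp to the start of its window.
        window = ts.replace(minute=ts.minute - ts.minute % minutes,
                            second=0, microsecond=0)
        counts[(window.date(), window.time())] += 1
    return counts

# Made-up records, for illustration only.
sample = [
    (datetime(2015, 1, 3, 11, 46), "A", EXTERNAL),
    (datetime(2015, 1, 3, 11, 59), "B", EXTERNAL),
    (datetime(2015, 1, 3, 12, 0), "C", EXTERNAL),
    (datetime(2015, 1, 3, 11, 50), "A", "B"),
]

for (day, start), count in sorted(external_counts(sample).items()):
    print(day, start.strftime("%H:%M"), count)
```

Keying the counts by date and time of day separately makes it straightforward to line up, for example, the 11:45 window of each of the three days.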



The following graph shows the number of messages sent to external IDs from the different park sectors, by hour, on Sunday. In this chart, we note that the extreme increase in messages sent to external IDs occurs in "Wet Land" (where the entrance to the exhibition is).



Pattern 4

The following is a "net graph". In it, IDs are represented as nodes, and edges represent relationships between these nodes, according to the messages sent between them. It was made with the "Gephi 0.8.2-beta" software tool, using the "Force Atlas 2" layout algorithm. It represents the IDs that communicate with external IDs between 11:45 and 12:00 on Sunday. Four groups can be seen (in red, yellow, violet and blue). The node in the center represents the external IDs. Nodes colored with low saturation represent IDs that do not send messages to external IDs during this period, but that communicate with IDs that do.
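The edge list for such a graph could be prepared as follows: keep the messages inside the time window, find the IDs that write to external IDs, and keep every edge that touches one of them. This is a sketch under assumptions (the record layout, the "external" label and the sample data are illustrative); the Source, Target and Weight column names follow the header convention that Gephi's spreadsheet import recognizes for edge tables.

```python
import csv
import io
from collections import Counter
from datetime import datetime

def build_edges(messages, start, end, external_label="external"):
    """Return (core_ids, edge_weights) for messages with start <= ts < end.

    core_ids: IDs that send at least one message to external IDs.
    edge_weights: Counter of (sender, receiver) pairs touching a core ID.
    """
    in_window = [(s, r) for ts, s, r in messages if start <= ts < end]
    core = {s for s, r in in_window if r == external_label}
    edges = Counter()
    for s, r in in_window:
        if s in core or r in core:
            edges[(s, r)] += 1
    return core, edges

def edges_to_csv(edges):
    """Serialize the edges as CSV text with a Source,Target,Weight header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()

# Made-up records, for illustration only.
sample = [
    (datetime(2015, 1, 3, 11, 50), "A", "external"),
    (datetime(2015, 1, 3, 11, 51), "B", "A"),
    (datetime(2015, 1, 3, 11, 52), "C", "D"),
    (datetime(2015, 1, 3, 12, 30), "A", "external"),
]
core, edges = build_edges(sample,
                          datetime(2015, 1, 3, 11, 45),
                          datetime(2015, 1, 3, 12, 0))
print(edges_to_csv(edges))
```

IDs that appear in an edge but not in the core set correspond to the low-saturation nodes: they talk to an ID that writes to external IDs without doing so themselves.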


Pattern 5

The entrance to the soccer player's exhibition venue ("Creighton Pavilion") is in the "Wet Land" sector, and the rest of it is in "Coaster Alley". Analyzing the total number of messages from these sectors around 11:00 on Sunday, there is a significant increase in messages sent from "Wet Land" between 11:30 and 11:45. At 11:45 this number begins to fall and, at the same time, the number of messages sent to external IDs from the same sector increases sharply. This could indicate that park visitors noticed the vandalism at 11:30 (as indicated by the news included in the challenge files) and that the park authorities learned of it fifteen minutes later. After that, they closed the sector, which explains why the total number of messages sent from the sector decreases while the number of messages to external IDs remains high (possibly security staff communicating with the police). Then, at 12:00, the messages sent from the sector peak again; this could indicate that visitors are communicating with acquaintances in other parts of the park as they leave the "Creighton Pavilion".


Pattern 6

Over the three days, 62067 messages are sent to external IDs, but no external ID sends a message to an ID within the park. Perhaps the messaging system does not allow messages to be sent from outside the park.
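The asymmetry described in this pattern can be checked by counting messages in each direction. A minimal sketch, assuming records are (sender, receiver) pairs and that external IDs share the label "external"; the sample data is made up.

```python
def external_traffic(messages, external_label="external"):
    """Return (messages sent to external IDs, messages sent by external IDs)."""
    to_external = sum(1 for _s, r in messages if r == external_label)
    from_external = sum(1 for s, _r in messages if s == external_label)
    return to_external, from_external

# Made-up records, for illustration only.
sample = [("A", "external"), ("B", "external"), ("A", "B")]
to_ext, from_ext = external_traffic(sample)
print(to_ext, from_ext)
```

A second value of zero over the whole data set would match the observation that nobody outside the park writes in.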

Pattern 7

The IDs that send messages to external IDs on the three days, at the times of the exhibition and from the "Wet Land" sector, are: 473436, 286210, 813636, 1392457, 1229505, 1860592, 871333, 1388440, 1901189, 1216041, 1970360, 1217112, 2090883.

These IDs are probably part of the park security personnel, for the following reasons:

The messaging patterns between these IDs were analyzed, and two groups were detected based on their communications. The first consists of IDs 871333, 1217112 and 2090883. The last two send and answer messages for the duration of the show. In this group, ID 871333 is possibly the chief of the other two: from time to time he writes to both of them simultaneously, but they do not write or respond to him.

The second group consists of IDs 286210, 918738 and 1970360. ID 918738 is possibly the leader of this group.

These potential employees are not always in the event area.

The following charts show the message exchanges between these IDs in the "Wet Land" area during the show times on the three days. The first displays the exchanges during the 11:00 show, and the second those during the 16:00 show.







As shown in the first chart, the first group began an intense exchange of messages at 11:37 that lasted until 12:00 on Sunday. This occurs only on Sunday, which would indicate that these security staff were possibly the first to know after the visitors. Moreover, as shown in the second chart, during the second show on Sunday there is little communication between the security personnel guarding the "Creighton Pavilion" compared to the other days.

 

MC2.3 From this data, can you hypothesize when the crime was discovered?  Describe your rationale.

 

Limit your response to no more than 3 images and 300 words. 

 

 

The second presentation on Sunday did not take place, because there was not the usual rush of messages around 16:00 and because communication between security staff in the event area is much lower than usual. Therefore we are sure that the vandalism occurred before 16:00 on Sunday. As discussed above and shown in the graphics, we conclude that the vandalism was discovered by visitors on Sunday around 11:30. Then, at 11:37, the authorities found out, and finally, at 11:45, they began to close the "Creighton Pavilion" and to send messages to external people, whom we assume to be the police. In the messages sent and received by ID 839736, we also see that a message peak occurs at 11:45.