SZTAKI-graphVis
VAST 2009 Challenge
Challenge 2 - Social Network and Geospatial

Authors and Affiliations:



Zsolt Fekete, MTA SZTAKI, Hungary , zsfekete@info.ilab.sztaki.hu [PRIMARY contact]

András Lukács, MTA SZTAKI, Hungary, lukacs@info.ilab.sztaki.hu

Dóra Erdős, MTA SZTAKI, Hungary, edori@info.ilab.sztaki.hu

Tool(s):

Currently we are developing a graph visualization tool at MTA SZTAKI. We customised it for use in this challenge. The main feature we added is a tool to find certain subgraphs of a graph. With this tool we are able to define a subgraph with different conditions on the vertices. We are able to put constraints on the degrees, city size, existence of abroad links, or the person corresponding to the node being abroad. We can declare vertices to be connected or not connected. We can display on the screen either all the neighbours of a given vertex or only those neighbours that satisfy some conditions.

 

Video:

 

Link to Video

 

 

ANSWERS:


MC2.1: Which of the two social structures, A or B, most closely match the scenario you have identified in the data?

A


MC2.2:  Provide the social network structure you have identified as a tab delimitated file. It should contain the employee, one or more handler, any middle folks, and the localized leader with their international contacts. What are the Flitter names of the persons involved? Please identify only key connections (not all single links for example) as well as any other nodes related to the scenario (if any) you may have discovered that were not described in the two scenarios A and B above.

Flitter.txt


MC2.3:  Characterize the difference between your social network and the closest social structure you selected (A or B). If you include extra nodes please explain how they fit in to your scenario or analysis. 

Given the format in which the Flitter connections were provided it was an obvious thought to represent the Flitter network as a graph and try to find the criminal network in the form of a subgraph. At first we only used the connections between the members of the criminal organization described in the scenarios. We searched for a subgraph where the nodes correspond to the members involved in the organization and the edges show whether they have direct contact with each other. We wrote a simple C++ code to find these subgraphs. We thought we would only need our graph visualization tool to display the result, once we found the needed id's. But it turned out that by only looking at the links between the id's there are several hundreds of subgraphs satisfying the described scenarios. That's when we started to improve the graph visualization tool. Now with the help of the tool we can build up the subgraph step-by-step. One can add new nodes and determine with which other nodes they should or should not be connected. One can add constraints on the degrees and on the city locations. An other useful feature we developed for the tool is that it can save and load subgraph configurations. That way we can easily load a previous configuration and add or remove constraints or vertices from it. This way we started with the part of the subgraph which is common for scenario A and B. The suspected Employee has three connections in the network, his Handlers. All four persons have 30 to 40 connections in the Flitter network. Starting from this base in scenario A all the Handlers have one common Middleman, in B every Handler has his own Middleman. And at last we added the node corresponding to the Fearless Leader to both scenarios.





With this tool we discovered there is no subgraph that satisfies all the conditions given in the challenge (degrees, connectivity and geographical constraints). So we decided to take the limitations mentioned in the first part of the challenge (degrees and connectivity) very strictly while we did not insist on all the location criteria to be satisfied. So we decided to use the facts that the degrees of the Employee and the Handlers are between 30 and 40, the degree of the Fearless Leader is more than 100. We also insisted on him having an abroad connection. We were given the fact that a Middleman has no more than one or two connections outside the mentioned scenario. Thus in scenario A he has degree less than or equal to 6 while in scenario B less than or equal to 4. Considering these constraints there are no subgraphs satisfying scenario B but there are two subgraphs that satisfy A. We designed our tool in such a way that it does not only show the id of the vertices of the found subgraph but also other important information. These are the degree of the node, the city where it is located and how many abroad links it has. With the help of all this information we had to decide which of the two possible solutions for A fits the assumptions best. Based on the geographical information we decided to go with the subgraph below:



























In our scenario of the criminal network the suspected Employee would be @schaffter (id 100). His Handlers are @reitenspies (194), @kushnir (id 261) and @pettersson (id 563). Their Middleman is @good (id 4994). @good has a degree of 5 which means he has one more neighbour not shown on the graph yet. We think we should include him (id 1612 @moilanen) in the criminal network as well since the Middleman is suspected to communicate inside the criminal network only. The Fearless Leader would be @szemeredi (id 4) in Kouvnic. He has 14 abroud links. We consider these persons also suspicious since the aim of the network is to pass important information coming from the embassy abroad. This way the entire network looks like this:




MC2.4:  How is your hypothesis about the social structure in Part 1 supported by the city locations of Flovania? What part(s), if any, did the role of geographical information play in the social network of part one? 

In part 1 we could rule out scenario B but we had still two possible solutions for scenario A. The city locations helped us to choose between these to solutions. Both answers would put the suspected Employee and his Handlers in large cities as was required. Unfortunately there is no subgraph fitting the degrees constraints that would put the Fearless Leader in a large city. One of the solutions would also misplace the Middleman in the large city of Koul. That is why we chose the other solution where the suspected Fearless Leader and Middleman are both in middle cities, Kouvnic and Kannvic .


MC2.5:  In general, how are the Flitter users dispersed throughout the cities of this challenge? Which of the surrounding countries may have ties to this criminal operation?  Why might some be of more significant concern than others?

The Flitter users are dispersed uniformly throughout the cities, that is, the number of users in each city is proportional to its size. On the other hand most of the suspected criminal network is concentrated in the city of Prounov which might indicate that this city is the center of operation of the network. To analyize the connections of this operation to the surrounding countries we have looked at the foreign contacs of the suspected Fearless Leader. We have found a total of 14 foreign contacts out of which 7 are in Otello, 4 in Tulamuk and 3 in Transpasko. This information would suggest that Posana is the most likely to be have ties to the organization. On the other hand if we take a closer look at the 14 people, only one of them has sufficiently many contacts to be really suspicious, namely @tolbert from Tulamuk. This might indicate that Trium should be of more significant concern.