From Oculus Info Inc.:
Casey Canfield, ccanfield@oculusinfo.com PRIMARY
Daniel Cheng, dcheng@oculusinfo.com
David Gauldie, dgauldie@oculusinfo.com
David Jonker, djonker@oculusinfo.com
Scott Langevin, slangevin@oculusinfo.com Team Lead
Peter Schretlen, pschretlen@oculusinfo.com
Chris Wu, cwu@oculusinfo.com
Student Team: NO
Aperture, developed by Oculus Info Inc.
Our team used Aperture to rapidly develop a tailored visualization solution
for this challenge.
Aperture is an open and extensible Web 2.0 visualization framework, designed
for analysts and developers to use in any common web browser. Aperture uses a
novel layer-based approach to visualization assembly and a data mapping API
that simplifies the adaptable transformation of data and analytic results
into visual forms and properties. This common visual layer and data mapping
API, combined with core elements such as contextually derivable color palettes,
layout, and symbol ontology services, is designed to enable highly creative and
expressive visual analytics, rapidly and with less effort.
We designed a tailored situation awareness and analysis application showing
thumbnail time series charts of policy events, performance events, and derived
questionable activity issues. These thumbnails are arranged according to the
hierarchical organization of the Bank of Money. In a single view, we display the
corporate headquarters (CHQ), the large data centers (DC1-5) and all of the
large and small regions. The charts are generated based on normalized counts of
events as a percentage of the total number of machines in the region, with the
base of each time scale proportional to the number of machines. Regions are
sorted by time zone, and shaded areas in each thumbnail indicate periods outside
of business hours for that area. Interactions include drilling down to view
detailed event counts and machine type distributions for each region or center.
The legend allows event filtering by type and by subtype of
“Questionable Activity.” The application also contains an
interactive map in the upper left to allow analysis of geo-based trends.
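The normalization described above (event counts expressed as a percentage of
the machines in a region) can be sketched as follows. This is a minimal Python
illustration, not Aperture code; the field names `region` and `time_bin` and
the `machines_per_region` lookup are assumptions made for the example.

```python
from collections import Counter


def normalized_event_rates(events, machines_per_region):
    """Return {(region, time_bin): event count as a percentage of the
    region's machine count}.

    `events` is an iterable of dicts with hypothetical keys "region" and
    "time_bin"; `machines_per_region` maps a region name to its number of
    machines. Both shapes are assumptions for this sketch.
    """
    counts = Counter((e["region"], e["time_bin"]) for e in events)
    rates = {}
    for (region, time_bin), n in counts.items():
        total = machines_per_region[region]
        rates[(region, time_bin)] = 100.0 * n / total
    return rates


if __name__ == "__main__":
    sample = [
        {"region": "Region-A", "time_bin": 0},
        {"region": "Region-A", "time_bin": 0},
        {"region": "Region-B", "time_bin": 0},
    ]
    print(normalized_event_rates(sample, {"Region-A": 4, "Region-B": 10}))
```

Dividing by the machine count makes a small region and a large data center
comparable on the same vertical scale, which is what allows the thumbnails to
be read side by side.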
Video:
MC1 - Interactive Situation Awareness Video
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe?
First, our visualization revealed a severe policy issue in Data Center 2
(DC2). Our tool displays a red color in the policy status bar that intensifies
for more severe violations, signaling a major area of concern. In the detailed
data for DC2, we observed a level 5 policy deviation on IP address 172.2.194.20.
Figure A: Potential virus infection in DC2.
Second, by visualizing regional aggregates of policy status, and then comparing across all regions,
our tool showed a pattern of small numbers of severe policy deviations across
most regions. Interactive scaling of red policy status bar charts suggested an
upward trend over a three hour period. We opened detailed table views for each
affected region, and repeatedly observed small numbers of severe policy
deviations. We confirmed the upward trend in both the number and severity of
deviations.
Figure B: An alarming number of severe policy deviations.
Third, we observed a pattern of after-hours maintenance in regions that
were outside of business hours as of 2 PM BMT on February 2nd. This
pattern included workstations (which should be powered down) with multiple
connections. Since our visualization incorporates shading to indicate operating
hours for a given region, we determined which regions had significant activity
outside of normal business hours. After opening the details for these regions
in our comparison view, we saw a consistent pattern of after-hours maintenance
of workstations. The pattern was small but steady across all affected regions.
Our analytic displays only events that involve an abnormal activity code
(greater than 1), alerting us to the suspicious pattern.
We also observed a pattern of similar activity with ATMs. However, unlike
workstations, ATMs are not normally assumed to be powered down after business
hours.
Figure C: A consistent pattern of after-hours maintenance (with abnormal
activity codes) for one region.
MC 1.2. Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?
Anomaly 1 – Policy Status Degradation Over Time
We adjusted the visualization to show activity over the entire 48 hour
span, and then filtered this view to show only policy status. This revealed a
clear trend of escalation in policy deviations, in both severity and number. By
inspecting this global view and increasing the chart scale, we could narrow
down the start time of the anomaly by observing the first occurrence of bright
red “critical policy deviation” bars in the data. Since the
visualization indicated that policy violations were reported shortly after the
beginning of the data set, we narrowed the scope of visible data to concentrate
on the initial activity on February 2nd. The first
“serious” policy violations occurred between 8:15 AM and 8:30 AM BMT
on February 2nd, in multiple regions. The first “severe”
policy violation happened between 9:15 AM and 9:30 AM BMT on February 2nd,
at Corporate Headquarters (CHQ) and at DC2. We opened a detailed comparison of
CHQ and DC2, and discovered that the events at CHQ occurred on office
workstations, and the events at DC2 happened on computational servers.
From our table view, we drilled down into the list of IP addresses
affected by the notifications. We observed that a machine in DC2
(172.2.194.20) that triggered a serious policy deviation at 8:15-8:30 also
triggered a critical policy deviation at 9:15-9:30. We also discovered the same
pattern of matching IP addresses elsewhere, particularly in a CHQ workstation
(172.1.56.176).
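The IP matching described above, finding machines that report a serious
deviation and later a critical one, can be sketched in Python as below. This
is an illustration only, not the tool's implementation. The tuple layout and
the numeric levels (3 for "serious", 4 for "critical") are assumptions and
should be adjusted to the actual encoding of the data set.

```python
def escalating_ips(reports, first_level=3, later_level=4):
    """Return the sorted IPs that reported `first_level` and, at a strictly
    later time, `later_level`.

    `reports` is an iterable of (timestamp, ip, policy_status) tuples with
    mutually comparable timestamps. The level numbers are hypothetical
    defaults, not values confirmed by the source.
    """
    first_seen = {}   # ip -> earliest timestamp at first_level
    escalated = set()
    for ts, ip, status in sorted(reports):
        if status == first_level:
            first_seen.setdefault(ip, ts)
        elif status == later_level and ip in first_seen and ts > first_seen[ip]:
            escalated.add(ip)
    return sorted(escalated)


if __name__ == "__main__":
    rows = [
        ("09:15", "172.2.194.20", 4),
        ("08:15", "172.2.194.20", 3),
        ("08:15", "10.0.0.1", 4),   # made-up row: critical before serious
        ("09:00", "10.0.0.1", 3),
    ]
    print(escalating_ips(rows))
```

Sorting by timestamp first means a single pass is enough, and an IP whose
critical report precedes its serious report is not counted as escalating.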
Figure 1-1: Examining the detailed data for the policy violation anomaly
in the CHQ workstation.
Broadening the scope of visible data in the global visualization revealed
that other locations began reporting critical policy deviations, and eventually
virus notifications, in gradually increasing amounts. These alerts became
numerous and pervasive throughout the network at the conclusion of the
available data.
This pattern of activity could be caused by malware designed to increase
security vulnerabilities on infected machines. This eventually leads to
widespread critical policy violations, and a rampant virus infection throughout
the enterprise.
Figure 1-2: Observing the upward trend in policy violations across the
enterprise.
Anomaly 2 – Types of Activity Absent in Regions 6, 8, 9, and 46
A 48-hour regional view, filtered to chart only green “Questionable
Activity,” showed gaps in Regions 8 and 9. Opening detailed table views
showed these regions never reported external device activity, while other large
regions reported thousands of instances.
Using the same view, but filtered for Performance Issues, we determined that
reports of fully consumed CPU were absent from Regions 6 and 46.
Because the affected regions reported other types of activity, the anomaly may
indicate that compromised reporting services were returning false status.
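The same kind of gap can be checked outside the visualization by listing, for
each region, the event types it never reported. The Python sketch below is
illustrative only; the keys `region` and `event_type` and the example type
names are assumptions, not identifiers from the data set.

```python
def missing_event_types(events, regions, event_types):
    """Return {region: [event types never reported by that region]}.

    Regions that reported every type in `event_types` are omitted.
    `events` is an iterable of dicts with hypothetical keys "region"
    and "event_type".
    """
    seen = {(e["region"], e["event_type"]) for e in events}
    gaps = {}
    for region in regions:
        absent = [t for t in event_types if (region, t) not in seen]
        if absent:
            gaps[region] = absent
    return gaps


if __name__ == "__main__":
    sample = [
        {"region": "R1", "event_type": "external_device"},
        {"region": "R1", "event_type": "cpu_full"},
        {"region": "R2", "event_type": "cpu_full"},
    ]
    print(missing_event_types(sample, ["R1", "R2"],
                              ["external_device", "cpu_full"]))
```

A region that appears in the result while still reporting other event types
matches the pattern described above, where only one kind of report is absent.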
Figure 2-1: Observed gaps in device activity reporting in Regions 8 and 9.
Figure 2-2: Observed gaps in fully consumed CPU reporting in Regions 6 and 46.
Anomaly 3 – Off-Hours Maintenance Activity
Our tool allowed us to see this activity in the 2 PM BMT snapshot of network
“health.” We were also able to broaden the scope of our
visualization to better reveal this trend of activity over time. We switched to
a 48-hour view, and were able to see that after-hours maintenance continued
over the entire two-day reporting period. Opening the event details table for a
region shows that the maintenance happened on only a few machines at a time. We
confirmed that throughout the entire reporting period, the pattern still
consistently occurred on all machine types.
Again, our analytic reports only those after-hours connections that have an
activity code greater than 1, signaling an “abnormal” condition. Because
of this, we immediately regarded the activity as suspicious.
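A filter of this kind can be sketched in Python as follows. It is a minimal
illustration, not the analytic used in our tool: the keys `activity_code` and
`local_time` are hypothetical, and the 07:00 to 18:00 business-hours window is
a placeholder assumption rather than a value taken from the challenge data.

```python
from datetime import time


def is_after_hours(local_time, open_at=time(7, 0), close_at=time(18, 0)):
    """True when `local_time` falls outside [open_at, close_at).

    The default window is a placeholder assumption for this sketch.
    """
    return not (open_at <= local_time < close_at)


def suspicious_after_hours(events):
    """Keep events with an abnormal activity code (greater than 1) that
    occur outside business hours in the machine's local time.

    Each event is a dict with hypothetical keys "activity_code" (int)
    and "local_time" (datetime.time).
    """
    return [
        e for e in events
        if e["activity_code"] > 1 and is_after_hours(e["local_time"])
    ]


if __name__ == "__main__":
    sample = [
        {"ip": "10.0.0.1", "activity_code": 2, "local_time": time(22, 0)},
        {"ip": "10.0.0.2", "activity_code": 3, "local_time": time(10, 0)},
        {"ip": "10.0.0.3", "activity_code": 1, "local_time": time(23, 0)},
    ]
    print([e["ip"] for e in suspicious_after_hours(sample)])
```

Because the check uses each machine's local time, the same rule applies to
regions in different time zones once timestamps have been converted.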
Since our analytics did not measure suspicious after-hours activity for
servers, we were unable to determine whether servers also experienced similar
behavior patterns after business hours. Our analytics also did not measure
suspicious activity for workstations during office hours, so we were unable to
determine whether abnormal increases in activity also occurred on these
machines during those times.
By clicking on each number in the detailed event chart, we were able to see that
several IP addresses associated with after-hours activity were also issuing
maintenance downtime notifications. Since the Bank of Money does not have
scheduled maintenance intervals, it is very possible that some of their
maintenance work occurs after-hours, making this a normal pattern of activity.
However, since there is not an exact correlation between IP addresses, and
since the “green” events have abnormal activity codes, it is also
possible that some of the activity is not related to normal maintenance, or
some machines going down for maintenance have fewer than two connections.
Figure 3-1: After-hours activity occurs over the entire reporting period.
Anomaly 4 – Activity Trends in DC5 and Region 25, on February 2nd
Rapidly comparing 24-hour views of February 2nd and 3rd (using
the forward/back buttons in the browser, with policy status turned off)
revealed two differences. On February 3rd, activity is uniform
across all regions and data centers. However, on February 2nd,
during business hours, Region 25 activity decreases, while DC5
activity increases. This was confirmed by clicking on each chart to display
tables of detailed event counts.
The anomaly could indicate a facility-wide problem being addressed in DC5, while
a region-wide problem may be impacting Region 25.
Figure 4-1: Activity anomaly revealed through rapid visual comparison of charts.