Posts Tagged ‘cloud computing’

Google Analytics Update

Wednesday, August 29th, 2012

Last year I wrote about taking apart a MySpace cookie.  Included in that posting was some discussion on Google analytics tools found within the cookie.  It was interesting and I got some good feedback about the blog entry.  I was contacted by Jim Meyer of the DoD Cyber Crime Center about some further research they had done on the Google analytics within cookies and a presentation they were preparing at the time for the 2012 DoD Cybercrime conference (if you saw the presentation at DoD let me know how it went).

They were able to determine more information about the specific pieces of the Google analytics cookie placed on a user’s computer when they go to a webpage that contains Google Analytics.

The Google Analytics cookie collects, stores, and reports certain information about a user’s contact with a webpage that has the embedded Google Analytics JavaScript code. This includes:

  • Data that can determine if a user is a new or returning user
  • When that user last visited the website
  • How long the user stayed on the website
  • How often the user comes to the site, and
  • How the user arrived at the site:
    • Whether the user came directly to the website,
    • Whether the user was referred to the site via another link,
    • Or whether the user located the site through the use of keywords.

Jim Meyer and his team used Google’s open source code page to help define several pieces of the code and what exactly it does when downloaded. Here is some of what they were able to determine. (The examples are the ones I used in my last posting, with a little more explanation about what everything means; I explained how I translated the dates and times in my last posting.) For a complete review of their findings, contact Jim at the DoD Cyber Crime Center.

Example

Cookie:            __utma

102911388.576917061.1287093264.1287098574.1287177795.3

__utma This records information about the site visited and is updated each time you visit the site.
102911388 This is a hash of the domain you are coming from
576917061 This is a randomly generated number from the Google cookie server
1287093264 This is the actual time of the first visit to the server
576917061.1287093264 These two together make up the unique ID Google uses to track users. Reportedly, Google does not track by personal information or specific browser information.
1287098574 This is the time of the previous visit to the server
1287177795 This is the time of the last visit to the server
3 This is the number of times the site has been visited
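The three long numbers are standard Unix epoch timestamps, so converting them for a report is straightforward. Here is a minimal sketch in Python; the dictionary keys are my own labels for the fields described above, not names Google uses:

```python
from datetime import datetime, timezone

def parse_utma(value):
    """Split a __utma cookie value into its six dot-separated fields."""
    domain_hash, visitor_id, first, previous, last, count = value.split(".")
    to_utc = lambda ts: datetime.fromtimestamp(int(ts), tz=timezone.utc)
    return {
        "domain_hash": domain_hash,          # hash of the domain
        "visitor_id": visitor_id,            # random number from the cookie server
        "first_visit": to_utc(first),        # time of the first visit
        "previous_visit": to_utc(previous),  # time of the previous visit
        "last_visit": to_utc(last),          # time of the last visit
        "visit_count": int(count),           # number of visits to the site
    }

info = parse_utma("102911388.576917061.1287093264.1287098574.1287177795.3")
print(info["first_visit"])  # 2010-10-14 21:54:24+00:00
print(info["visit_count"])  # 3
```

Note the timestamps are interpreted as UTC here; adjust to the examined system’s time zone as needed.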

 Example

Cookie:            __utmz

102911388.1287093264.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none) 

__utmz This cookie stores how you got to this site.
102911388  Domain hash
1287093264 Timestamp of when the cookie was last set
1 # of sessions at this time
1 # of different sources visitor has used to get to the site.
utmcsr Last website used to access the current website
=(direct) This means I went directly to the website; “organic” would indicate arrival from a Google search, and a “referring link” would show the site the link came from.
|utmccn=(direct)  AdWords campaign words can be found here
|utmcmd=(none) Search terms used to get to the site may be found in the cookie here.
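The __utmz value splits the same way: the first four dot-separated fields are numeric, and the remainder is a set of pipe-delimited key=value pairs. A quick sketch (the labels are again my own):

```python
from datetime import datetime, timezone

def parse_utmz(value):
    """Split a __utmz cookie into its numeric fields and campaign pairs."""
    # maxsplit=4 keeps the campaign data intact even though it contains dots
    domain_hash, ts, sessions, sources, campaign = value.split(".", 4)
    pairs = dict(p.split("=", 1) for p in campaign.split("|"))
    return {
        "domain_hash": domain_hash,
        "set_time": datetime.fromtimestamp(int(ts), tz=timezone.utc),
        "sessions": int(sessions),
        "sources": int(sources),
        **pairs,  # utmcsr, utmccn, utmcmd, etc.
    }

z = parse_utmz("102911388.1287093264.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)")
print(z["utmcsr"])  # (direct)
```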

 Example

Cookie:            __utmb

102911388.0.10.1287177795 

__utmb This is the session cookie which is only good for 30 minutes.
102911388 This is a hash of the domain you are coming from
0 Number of pages viewed
10 meaning unknown
1287177795 The last time the page was visited

Remember, though, that all of this can be different if the system deletes the cookies or the user runs an application that cleans them out. Also, it is all relative: it depends on the system, on user behavior, and on when and how many times they have visited a particular site.

You can find out more about the description of the cookies at http://code.google.com/apis/analytics/docs/concepts/gaConceptsCookies.html#cookiesSet

Google Analytics can set four main cookies on the user’s machine:

__utma Unique Visitors
__utmb Session Tracking
__utmc Session Tracking
__utmz Traffic Sources

Optional cookies set by Google Analytics:

__utmv Custom Value
__utmx Website Optimizer

Google Analytics creates varying expiration times for its cookies: 

__utma The information on unique user detection expires after 2 years
__utmz The information on tracking expires after 6 months
__utmv The information on “Custom Tracking” will expire after 2 years
__utmx The information on the “Website Optimizer” will expire after 2 years
__utmb/__utmc The information about a current visit will expire 30 minutes after the last pageview on the domain.

The original code schema, written by Urchin, was called UTM (Urchin Traffic Monitor) JavaScript code. It was designed to be compatible with existing cookie usage, and all the UTM cookie names begin with “__utm” to prevent any naming conflicts.

Tracking the Urchin- from an investigative point of view

Okay, so here is some additional new material on Google Analytics for when you are examining the source code of a webpage. What is the Urchin? Google purchased a company called Urchin that had a technology for traffic analysis. The cookies still carry Urchin’s original names.

When examining a live webpage that contains Google analytics code embedded in the website you will come across code that looks similar to this:

<script type="text/javascript"><!--
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
// --></script>
<script type="text/javascript"><!--
try {
var pageTracker = _gat._getTracker("UA-9689708-5");
pageTracker._trackPageview();
} catch(err) {}
// --></script>

Search the source code for “getTracker” and you will find the following line: var pageTracker = _gat._getTracker("UA-9689708-5"); which contains the website’s assigned Google Analytics account number, “UA-9689708-5”. So what does this mean, and how can it be of value to me when I am investigating a website? Let’s identify what the assigned number means:

UA Stands for “Urchin Analytics” (the name of the company Google purchased to obtain the technology)
9689708 Google Analytics account number assigned by Google
5 Website profile number
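Because the ID always follows the UA-account-profile pattern, pulling it out of saved page source can be automated with a simple regular expression. This is an illustrative sketch, not any official tool:

```python
import re

# "UA-" followed by the account number, a dash, and the profile number
UA_PATTERN = re.compile(r"UA-(\d+)-(\d+)")

def find_ua_ids(page_source):
    """Return (account_number, profile_number) pairs found in page source."""
    return UA_PATTERN.findall(page_source)

snippet = "var pageTracker = _gat._getTracker('UA-9689708-5');"
print(find_ua_ids(snippet))  # [('9689708', '5')]
```

Run against a saved copy of the page (view-source or a wget capture), this gives you every analytics account referenced by the page.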

How can I use this Google analytics number in an investigation? First you can go to http://www.ewhois.com/ to run the UA # and identify the company/person assigned the number.

The response you will get is something similar to this:

[screenshot of the ewhois.com results]

Then run the Google Analytics number through Reverseinternet.com:

[screenshot of the Reverseinternet.com results]

This is of a little more investigative use in that it shows domains that use the same Google Analytics ID, the Internet Protocol addresses assigned to the domains, and the DNS servers used by the domains.

Using Reverseinternet.com allows you to identify any webpage where this Google Analytics ID has been embedded in the source code. This can be of investigative value if the target has used the same ID on more than one webpage they control or monitor. Why would this occur? Google allows the user to monitor data from multiple sites from a single control panel.

So how does Google analytics work?

Google is probably a better place to find this out. You can go to http://code.google.com/apis/analytics/docs/concepts/gaConceptsOverview.html for a complete overview of how it works.

In short, the Google Analytics JavaScript code embedded in the webpage you visit collects information from the following sources when you connect to the page:

  • The HTTP request of the visitor’s browser
  • Browser/system information from the visitor
  • The cookie it sends to the visiting system

All of this gives the webpage owner the ability to track persons going to their webpage. From an investigative point of view there is a certain amount of exposure, due to the browser tracking that occurs and the fact that a cookie is placed on your investigative system. But examining the page source code also offers the possibility of tying the website, through its Google Analytics ID, to other webpages of interest.

A forensic look at the installed IDrive backup service files

Thursday, August 16th, 2012

I didn’t intend to dissect files when I started looking at IDrive. My intent was to look at its operation and determine a method of file acquisition as a “Cloud” service. That is still an ongoing project. What I found, though, is a little disturbing from a user point of view, and fantastic from a forensic point of view. I originally wrote this last year and never got it posted. When I originally looked at IDrive I found some interesting information. I thought after a year they would have changed their methods of obscuring their client information on the local machine. Alas, no…. Here is what I found.

IDrive Background

On its corporate information page, IDrive identifies itself as a “service” of Pro Softnet Corporation, based in Calabasas, California. Pro Softnet has been around since 1995, providing Internet-based solutions. They have several other products, including IBackup and RemotePC.

Disturbing Findings or not so disturbing for the Forensic examiner

I downloaded the “Free” version of IDrive’s software.  I wanted to test it and potentially include it in our training as a discussion item on cloud investigation issues. IDrive is unique among the “Online Backup” providers in that they offer “Free” storage of up to 5 GB of data. The other companies in this space seem to only offer a free trial period of their product. IDrive was unique enough that I thought I needed to try it.

This short blog entry is not a review of the entire installation of the software. I did not look into the registry or examine every file. I did, however, find a few things that are worth mentioning for the forensic examiner. I quickly and easily installed the software and uploaded some test data into their storage. I then started to poke around on my machine to identify where IDrive put its files. I did not have to go far. IDrive’s files are found on the local system hard drive under the IDrive folder in the “Program Files” folder.

In the main IDrive folder is the 128-bit “rc4.key” encrypted key file, which I am sure is used by the system to communicate with the IDrive server. RC4 is almost 25 years old as an encryption scheme; it is, however, still in common use today. I did not examine its implementation in the communication scheme of the product any further, or try to crack it.

IDrive Temp Folder

In the IDrive “Temp” folder there were two folders with similarly named files. The file “DLLOutput1.txt” contained only an IP address of 206.221.210.66 (and what appears to be a port number of 11663), which belongs to IDrive’s parent company, Pro Softnet.

The file “DLLInput1.txt” similarly contained a small amount of important information. The format was:

8-16-2012 11-52-21 AM

 We will discuss the username and password translation below.

LDB Folder

In the LDB folder is a file titled “IDriveLDB.IDr”. The file is an SQLite database containing file paths of the data to be backed up.

Log Folder

Under the “Log” folder is another file, named “Realtime Trace.txt”. This file is a log with connection dates and times. It recorded the backup operations to IDrive, including the IDrive user name, data file names and paths, the start and end times of the backup, the number of files backed up, and any files excluded from the backup.

Folder with local computer name

In the folder with the local computer’s name was a file titled “Backupfile.txt”. This file contained a list of the files backed up to the IDrive server. In this same folder was another file, “BackupSet.txt”, that appeared to contain the dates and times of the backups.

IDrive\”Username”

In the IDrive user folder there is a file called “IDriveE.ini”. The contents are a little lengthier, but revealing. At first glance there is the same IP address identified above, the port, and much more information. I looked at the lines in the file and realized that some encryption scheme was used. The question was: what? Thinking I would not easily find out what scheme had been implemented, I used a program to simply try various ciphers common in obfuscation. Without much effort I revealed my passwords and my user name from the text. The obfuscation used by IDrive was a simple 2-position Caesar cipher.

Text in File Translation
user 2-position Caesar cipher; my login user name
User Password=xxxxxxx 2-position Caesar cipher; my login password
gpercuuyqtf=xxxxxxxx 2-position Caesar cipher; decodes to “encpassword=mypassword”
Enc password=xxxxxx 2-position Caesar cipher; my encrypted IDrive password, but only the first 6 characters of it
wugtgocknkf=vqffBxgtguqhvyctg0eqo 2-position Caesar cipher; decodes to “useremailid=todd@veresoftware.com”

(Real password has been removed for my security….)

These were not the only lines that used the 2-position Caesar cipher. In going through the entire file, the lines not in plaintext all used this same cipher to encode their data.
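Note that the decoded examples include characters like “@” and “.” (recovered from “B” and “0”), which suggests the 2-position shift is applied across the ASCII range rather than just the alphabet. A minimal decoder under that assumption:

```python
def decode_shift2(text):
    """Undo a 2-position forward shift applied to each character's ASCII code."""
    return "".join(chr(ord(c) - 2) for c in text)

print(decode_shift2("gpercuuyqtf"))            # encpassword
print(decode_shift2("wugtgocknkf"))            # useremailid
print(decode_shift2("vqffBxgtguqhvyctg0eqo"))  # todd@veresoftware.com
```

The same one-liner, run over each obfuscated line of IDriveE.ini, recovers the plaintext values.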

IDriveEUsername_Folder

In reviewing a file named “SerTraceFile.txt” I found a log with more interesting information about the service and what it collects about my system. The file contained many pieces of information about the IDrive service and the local machine, including the local PC name and the NIC’s MAC address.

Conclusion

WOW…. So in looking at IDrive, the “encrypted” backup service, I found, from a forensic point of view, some substantially important failings on the local machine. Well, not failings from an investigative point of view; this is actually some great information. I made no attempt at the time of writing this blog entry to use the file information to log in from a separate machine. Until Pro Softnet changes the IDrive local machine files, digital forensic examiners will have access to some useful information from the IDrive files.

Post script

I am sure this will be changed in a follow-on version by Pro Softnet (at least I hope so), but for the record, what I found is limited to my examination of these specific versions on a Windows 7 machine. The IDrive versions I used in this testing were 3.3.4 and 3.4.1.

Tracing IP Addresses: Q&A

Friday, February 18th, 2011

We were very pleased to welcome back Dr. Gary Kessler to our “Online Investigations Basics” webinar series this week. Once again Dr. Kessler discussed some of the background and tools relevant to tracing IP addresses. Below is his companion presentation:

During the session, we took several questions from some of our listeners. One person asked whether tracing IP addresses overseas was any different from tracing them domestically. Answer: not technically; the overall process remains the same, but whether American investigators can secure foreign cooperation is a different question. The best bet is for investigators to contact legal representatives in American embassies for help dealing with law enforcement in another country.

Another participant asked whether TCP/IP packets would provide information on what kind of device accessed the Internet; in a related question, someone else asked if MAC addresses from two devices could show that they had been communicating with one another.

By themselves, packets contain no information on the type of device communicating. A device or router is needed to show where an IP address was assigned; the same is true for tracing IP addresses past a private network. And as for MAC addresses, they have only local relevance, not end-to-end applicability.

We wish we could have gotten into more detail on one question: what are the biggest challenges with tracing IP addresses in the cloud? As the load of traffic increases and IPv4 addresses diminish (before IPv6 takes hold), more ISPs will begin to allow shared IP addresses. On the flip side, multiple IP addresses will be resolved to single devices.

Again, we’re grateful to Dr. Kessler for taking the time to help educate the community on a complex issue. Have questions? Please contact us. And we’d love to see you at our future “Online Investigations Basics” webinars. In another few weeks, Cynthia Navarro will be talking about online sources of information. We hope you’ll join us!

Cloud computing: Not just for geeks or feds

Monday, February 8th, 2010

Think online investigation is just for the high-tech crimes types, the computer forensics geeks or the feds? Not so, says Todd in his interview with Cyber Speak’s Podcast (hosted, ironically, by two former federal agents). The more people are online, the more they’re likely to use cloud services, the more important it is for local law enforcement to be there too.

Todd’s appearance on Cyber Speak came about because of his two-part article on cloud computing, which had appeared in December in DFI News. He and Ovie Carroll discuss:

Impact of cloud computing on first responders

Detectives performing searches can’t simply pull the plug on a running computer anymore (a fact which prosecutors are having to get used to). They need to be able to perform data triage and possibly even volatile data collection.

Why? Because knowing whether a suspect has an online presence is critical to whether an arrest is made—and what happens afterward. Whether users are actively storing files “in the cloud” or simply members of social networking sites, law enforcement officers who don’t find evidence, and therefore do not make an arrest, risk that suspect going online and deleting all incriminating information.

Why is this a problem? Because the very nature of cloud storage means investigators may not be able to access a logical hard drive somewhere to recover the evidence. First, the sheer amounts of data stored on servers make this close to impossible. Second, there are jurisdictional issues.

Are you exceeding your authority?

Not only may information be stored outside your jurisdiction, but it may also be stored in another country altogether—one with different criminal and privacy laws. Accessing evidence of a crime in the United States may actually mean committing a crime in another country (Todd relates the story of two FBI agents for whom arrest warrants were issued in Russia).

This is a problem for local law enforcement, which Todd notes has been left largely to its own devices when it comes to online crime. Only Internet Crimes Against Children (ICAC) task forces have clear direction from the federal government on how to proceed.

Hence it’s easy for local police to kick Internet crimes up to regional, state or federal task forces. But as Todd points out, more people coming online means more crimes being committed against people in local jurisdictions both large and small. Law enforcement at every level needs to be able to respond.

Please listen to Todd and Ovie, and then come back and tell us what you think!

Christa M. Miller is Vere Software’s marketing/public relations consultant. She specializes in law enforcement and public safety and can be reached at christa at christammiller dot com.

A DFI News double feature

Friday, February 5th, 2010

We were pleased and honored in December when Digital Forensics Investigator (DFI) News opted to give two of Todd’s articles top billing on its site.

The articles, a two-part series, addressed whether collection of electronic evidence from the Internet is feasible. Some say no; obviously, we say yes!

In Part I, Todd drew from his 2007 white paper, “Collecting Legally Defensible Online Evidence,” to discuss the need for and development of a standard methodology for Internet evidence collection. In Part II, he addressed the application of that methodology specifically to “cloud” computing.

The cloud does present different challenges to evidence collection than do conventional Internet sources. But that doesn’t mean evidence collection from the cloud is impossible.

Read Part I here and Part II here. And please be sure to come back and tell us what you think. Do you agree? Disagree? Have you encountered the need for Internet evidence collection methodology… or investigative issues specific to the cloud? Comments are open!

Christa M. Miller is Vere Software’s marketing/public relations consultant. She specializes in law enforcement and public safety and can be reached at christa at christammiller dot com.