Weekly Update

Progress this week:

  • Read material on the analysis of Google cluster trace data
  • Finished project proposal

Plan for next week:

  • Decompress the Google cluster trace data
  • Start working on reading the trace data

Meeting on 25 March

Date: 25 March 2014
Time: 1:30pm – 2:30pm


  1. Discussed previous studies related to the Google cluster trace
  2. Discussed a high-level design for the simulator
  3. Since the trace data is large (40GB compressed), special techniques will be required to process the data

Planned tasks for the next 3 weeks:

  • Continue reading research papers related to the project
  • Find a way to read the Google cluster trace data from disk
  • If possible, extract information from the data
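
One possible way to read a large compressed trace from disk is to stream it row by row instead of loading it into memory. The sketch below assumes the trace files are gzip-compressed CSV; the function names iter_rows and count_by_column, and the choice of column to count, are hypothetical and only for illustration.

```python
import csv
import gzip
from collections import Counter
from typing import Iterator, List


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield one CSV row at a time from a gzip-compressed file."""
    # "rt" opens the compressed file in text mode, so data is
    # decompressed lazily as rows are requested.
    with gzip.open(path, "rt", newline="") as handle:
        yield from csv.reader(handle)


def count_by_column(path: str, column: int) -> Counter:
    """Count how often each value appears in one column (hypothetical helper)."""
    counts: Counter = Counter()
    for row in iter_rows(path):
        counts[row[column]] += 1
    return counts
```

Because only one row is held in memory at a time, the same loop should work whether a file is a few kilobytes or several gigabytes; only the running time grows.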



Before I start, I would like to welcome everyone to this blog. I hope to share with you my experience and progress on this project.

Let’s start with a brief introduction to the project. The project aims to study how a data center schedules its jobs. The main idea is to construct a simulator that behaves like a data center, with a job dispatcher, a scheduler and resources (CPU, memory and disk). An actual data center trace is used to help us understand how a real data center works. The Google cluster trace data can be found at http://code.google.com/p/googleclusterdata/.
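
To make the idea more concrete, here is a minimal sketch of how such a simulator could be structured. It is not the project's actual design: the Job and Machine classes, the simulate function, and the first-come-first-served, first-fit policy are all assumptions chosen for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Job:
    job_id: int    # assumed unique
    arrival: int   # time the job is submitted
    duration: int  # time units the job runs once started
    cpu: float
    memory: float
    disk: float


@dataclass
class Machine:
    cpu: float
    memory: float
    disk: float

    def fits(self, job: Job) -> bool:
        return (job.cpu <= self.cpu and job.memory <= self.memory
                and job.disk <= self.disk)

    def allocate(self, job: Job) -> None:
        self.cpu -= job.cpu
        self.memory -= job.memory
        self.disk -= job.disk

    def release(self, job: Job) -> None:
        self.cpu += job.cpu
        self.memory += job.memory
        self.disk += job.disk


def simulate(jobs: List[Job], machines: List[Machine]) -> Dict[int, int]:
    """Return each job's start time under FIFO dispatch and first-fit placement."""
    by_id = {job.job_id: job for job in jobs}
    pending = sorted(jobs, key=lambda j: j.arrival)  # dispatcher order
    running: List[Tuple[int, int, int]] = []  # heap of (finish, job_id, machine index)
    starts: Dict[int, int] = {}
    now = 0
    while pending:
        job = pending[0]
        now = max(now, job.arrival)
        # Free the resources of every job that has finished by now.
        while running and running[0][0] <= now:
            _, done_id, idx = heapq.heappop(running)
            machines[idx].release(by_id[done_id])
        # Scheduler: pick the first machine with enough free resources.
        idx = next((i for i, m in enumerate(machines) if m.fits(job)), None)
        if idx is None:
            if not running:
                raise ValueError(f"job {job.job_id} does not fit on any machine")
            now = running[0][0]  # wait until the next running job finishes
            continue
        machines[idx].allocate(job)
        starts[job.job_id] = now
        heapq.heappush(running, (now + job.duration, job.job_id, idx))
        pending.pop(0)
    return starts
```

In this sketch a job that does not fit blocks every job behind it, which is exactly the kind of weakness a simulator makes easy to observe and then improve on.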

Once the simulator is built, we would like to experiment with it to find possible weaknesses in the system. Various improvements could then be made to address those weaknesses.