COMP-614: Distributed Data Management
Project Guidelines
Outline
The project work is an essential part of this course. Its aim is to get
you started with independent research and work in the area of
distributed systems and distributed information systems. During the
first part of the project you will do literature research on a
specific topic and write a survey report about your readings (around
25 pages, 12pt, 1.5 spacing). You will present the survey to the class
in the form of a lecture with discussion. During the second part of the
project you should work out a new problem and its solution within this
research area. There are three alternatives:
- A research proposal for a relatively complex problem. I do not
expect a complete solution, but a description of the approach you would
take to obtain one. Your work should demonstrate an understanding of
the research area and enough insight into the problem that, given
enough time (2 to 5 more months), you could carry it to its logical
conclusion and complete the research. The deliverable is a report.
- A complete solution to a small, more specialized problem.
The deliverable is a report.
- An implementation and evaluation of algorithms presented in a
research paper. The deliverable is a report with an overview of the
implementation and the evaluation results.
The resulting report should also be around 20-25 pages long.
Schedule
- survey lecture: first to be given in 2 weeks
- date to be determined in February: survey report due
- beginning of March: 1-3 page plan of what you want to do in your
research report
- end of term: research report due
Talk schedule: see main page
On Pursuing Research
Literature Search
Below you will find several possible research topics from which you can
pick one. For each topic I will later provide at least three research
papers that can serve as starting points for your literature study. I
expect you to include at least one more paper in your research review.
If you find other papers on the topic that you think are better, feel
free to use them instead. For your talk and as the basis for your own
study, feel free to look at more papers or even choose different
papers.
You should look at the proceedings of the following conferences: for
database-oriented topics, ICDE, VLDB, SIGMOD, ...; for more distributed
systems topics, Middleware, ICDCS, DSN, SOSP. You do not need to look
at journals, but if you see an interesting-sounding journal paper, the
following journals are good: the IEEE and ACM Transactions on ...
journals (Transactions on Database Systems, Transactions on Computer
Systems, etc.), Information Systems, IEEE TKDE, and the VLDB Journal.
A good starting point for your own literature search is Google Scholar
or Michael Ley's Computer Science Bibliography (DBLP). The first is
maintained automatically and hence contains a lot of duplicate
information; the latter is more structured and consistent. Citeseer is
also a good starting point.
You will find many papers online. You will also have access to many of
the research papers through the McGill library system. A more recent
resource: McGill has subscribed to the digital libraries of ACM, IEEE,
and Springer. You can go directly to their webpages and have access if
you connect from a computer within McGill.
Research Topics
- RESOURCE ALLOCATION
- Pradeep Padala, Kai-Yuan Hou, Kang G. Shin, Xiaoyun Zhu, Mustafa Uysal, Zhikui Wang, Sharad Singhal, Arif Merchant:
Automated control of multiple virtualized resources. 13-26, EuroSys 2009
- Pradeep Padala, Kang G. Shin, Xiaoyun Zhu, Mustafa Uysal, Zhikui Wang, Sharad Singhal, Arif Merchant, Kenneth Salem:
Adaptive control of virtualized resources in utility computing environments. 289-302, EuroSys 2007
- Chuliang Weng, Minglu Li, Zhigang Wang, Xinda Lu:
Automatic Performance Tuning for the Virtualized Cluster System. 183-190, ICDCS 2009
- Gueyoung Jung, Kaustubh R. Joshi, Matti A. Hiltunen, Richard D. Schlichting, Calton Pu: A Cost-Sensitive Adaptation Engine for Server Consolidation of Multitier Applications. Middleware 2009: 163-183
- Ahmed A. Soror, Umar Farooq Minhas, Ashraf Aboulnaga, Kenneth Salem, Peter Kokosielis, Sunil Kamath: Automatic virtual machine configuration for database workloads. SIGMOD 2008:953-966
- VIRTUAL MACHINE MIGRATION
- Timothy Wood, Prashant J. Shenoy, Arun Venkataramani, Mazin S. Yousif: Black-box and Gray-box Strategies for Virtual Machine Migration. NSDI 2007
- Michael R. Hines, Kartik Gopalan: Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning. VEE 2009:51-60
- Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, Andrew Warfield: Live Migration of Virtual Machines. NSDI 2005
- OBJECT REPLICATION
- Ming Zhong, Kai Shen, Joel I. Seiferas:
Replication degree customization for high availability. 55-68, EuroSys 2008
- Jörg Domaschka, Thomas Bestfleisch, Franz J. Hauck, Hans P. Reiser, Rüdiger Kapitza:
Multithreading Strategies for Replicated Objects. 104-123, Middleware 2008
- Tudor Marian, Mahesh Balakrishnan, Ken Birman, Robbert van Renesse:
Tempest: Soft state replication in the service tier. 227-236, DSN 2008
- POWER
- Sriram Govindan, Jeonghwan Choi, Bhuvan Urgaonkar, Anand Sivasubramaniam, Andrea Baldini:
Statistical profiling-based techniques for effective power provisioning in data centers. 317-330, EuroSys 2009
- David C. Snowdon, Etienne Le Sueur, Stefan M. Petters, Gernot Heiser:
Koala: a platform for OS-level power management. 289-302, EuroSys
- Dara Kusic, Jeffrey O. Kephart, James E. Hanson, Nagarajan Kandasamy, Guofei Jiang: Power and performance management of virtualized computing environments via lookahead control. Cluster Computing (CLUSTER) 12(1):1-15 (2009)
- Liang Liu, Hao Wang, Xue Liu, Xing Jin, Wenbo He, QingBo Wang, Ying Chen: GreenCloud: A New Architecture for Green Data Center. International Conference on Autonomic Computing and Communications (ICAC 2009)
- REPLICATED DATABASES / CLUSTER
- Sameh Elnikety, Steven G. Dropsho, Emmanuel Cecchet, Willy Zwaenepoel:
Predicting replicated database scalability from standalone database profiling. 303-316, EuroSys 2009
- Sameh Elnikety, Steven G. Dropsho, Willy Zwaenepoel:
Tashkent+: memory-aware load balancing and update filtering in replicated databases. 399-412, EuroSys 2007
- Kaloian Manassiev, Cristiana Amza: Scaling and Continuous Availability in Database Server Clusters through Multiversion Replication. DSN 2007: 666-676
- Jin Chen, Gokul Soundararajan, Cristiana Amza: Autonomic Provisioning of Backend Databases in Dynamic Content Web Servers. ICAC 2006: 231-242
- REPLICATED DATABASES / WIDE-AREA
- Prince Mahajan, Ramakrishna Kotla, Catherine C. Marshall, Venugopalan Ramasubramanian, Thomas L. Rodeheffer, Douglas B. Terry, Ted Wobber:
Effective and efficient compromise recovery for weakly consistent replication. 131-144, EuroSys 2009
- Weihan Wang, Cristiana Amza: On Optimal Concurrency Control for Optimistic Replication. ICDCS 2009: 317-326
- João Barreto, Paulo Ferreira:
Efficient Locally Trackable Deduplication in Replicated Systems. 103-122, Middleware 2009
- CACHING
- Alexander Rasmussen, Emre Kiciman, V. Benjamin Livshits, Madanlal Musuvathi:
Improving the responsiveness of internet services with automatic cache placement. 27-32, EuroSys 2009
- Uwe Röhm, Sebastian Schmidt: Freshness-Aware Caching in a Cluster of J2EE Application Servers. WISE 2007: 74-86
- Xin Liu, Ashraf Aboulnaga, Kenneth Salem, Xuhui Li: CLIC: CLient-Informed Caching for Storage Servers. FAST 2009:297-310
- Debabrata Dash, Verena Kantere, Anastasia Ailamaki: An Economic Model for Self-Tuned Cloud Caching. ICDE 2009:1687-1693
- Gokul Soundararajan, Jin Chen, Mohamed A. Sharaf, Cristiana Amza: Dynamic partitioning of the cache hierarchy in shared data centers. PVLDB 1(1): 635-646 (2008)
- HARDWARE
- Philip Werner Frey, Gustavo Alonso:
Minimizing the Hidden Cost of RDMA. 553-560, ICDCS 2009
- Wei Huang, Qi Gao, Jiuxing Liu, Dhabaleswar K. Panda: High performance virtual machine migration with RDMA over modern interconnects. CLUSTER 2007:11-20
- Ronald Veldema, Michael Philippsen: Evaluation of RDMA Opportunities in an Object-Oriented DSM. LCPC 2007:217-231
- Philip Werner Frey, Romulo Goncalves, Martin L. Kersten, Jens Teubner: Spinning relations: high-speed networks for distributed join processing. DaMoN 2009: 27-33
- NEW PARADIGMS
- Marcos Kawazoe Aguilera, Arif Merchant, Mehul A. Shah, Alistair C. Veitch, Christos T. Karamanolis:
Sinfonia: a new paradigm for building scalable distributed systems. 159-174, SOSP 2007
- Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, Werner Vogels:
Dynamo: Amazon's highly available key-value store. 205-220, SOSP 2007
- Jeffrey Dean, Sanjay Ghemawat: MapReduce: Simplified Data Processing on Large Clusters. OSDI 2004:137-150
- Matthias Brantner, Daniela Florescu, David A. Graf, Donald Kossmann, Tim Kraska: Building a database on S3. SIGMOD 2008:251-264
- FAULT TOLERANCE IN MULTI-TIER ARCHITECTURES
- Luiz E. Buzato, Gustavo M. D. Vieira, Willy Zwaenepoel (IC, Unicamp, Campinas, Brasil; SCCS, EPFL, Switzerland):
Dynamic Content Web Applications: Crash, Failover, and Recovery Analysis. DSN 2009
- Manish Marwah, Shivakant Mishra, Christof Fetzer: Enhanced server fault-tolerance for improved user experience. DSN 2008: 167-176
- Huaigu Wu, Bettina Kemme: A Unified Framework for Load Distribution and Fault-Tolerance of Application Servers. Euro-Par 2009: 178-190
- LOAD-BALANCING IN MULTI-TIER ARCHITECTURES
- Jeremy Philippe, Noel De Palma, Fabienne Boyer, Olivier Gruber:
Self-adapting Service Level in Java Enterprise Edition. 143-162, Middleware 2009
- Giuliano Casale, Amir Kalbasi, Diwakar Krishnamurthy, Jerry Rolia:
Automatic Stress Testing of Multi-tier Systems by Dynamic Bottleneck Switch Generation. 393-413, Middleware 2009
- Wen-Syan Li, Daniel C. Zilio, Vishal S. Batra, Mahadevan Subramanian, Calisto Zuzarte, Inderpal Narang: Load Balancing for Multi-tiered Database Systems through Autonomic Placement of Materialized Views. ICDE 2006:102
- PEER-TO-PEER AND GAMES
- Stefano Ferretti: A synchronization protocol for supporting peer-to-peer multiplayer online games in overlay networks. DEBS 2008:83-94
- Arne Schmieg, Michael Stieler, Sebastian Jeckel, Patric Kabus, Bettina Kemme, Alejandro P. Buchmann: pSense - Maintaining a Dynamic Localized Peer-to-Peer Structure for Position Based Multicast in Games. Peer-to-Peer Computing 2008:247-256
- Ashwin R. Bharambe, John R. Douceur, Jacob R. Lorch, Thomas Moscibroda, Jeffrey Pang, Srinivasan Seshan, Xinyu Zhuang: Donnybrook: enabling large-scale, high-speed, peer-to-peer games. SIGCOMM 2008:389-400
- PUBLISH-SUBSCRIBE
- Ashwin R. Bharambe, Sanjay G. Rao, Srinivasan Seshan: Mercury: a scalable publish-subscribe system for internet games. NETGAMES 2002:3-9
- Antonio Carzaniga, David S. Rosenblum, Alexander L. Wolf: Design and evaluation of a wide-area event notification service. ACM Trans. Comput. Syst. 19(3): 332-383 (2001)
- DEBS conference
- PROBABILISTIC MULTICAST
- Davide Frey, Rachid Guerraoui, Anne-Marie Kermarrec, Boris Koldehofe, Martin Mogensen, Maxime Monod, Vivien Quéma: Heterogeneous Gossip. Middleware 2009: 42-61
- Patrick Th. Eugster, Rachid Guerraoui, Sidath B. Handurukande, Petr Kouznetsov, Anne-Marie Kermarrec: Lightweight probabilistic broadcast. ACM Trans. Comput. Syst. 21(4): 341-374 (2003)
- Milan Vojnovic, Varun Gupta, Thomas Karagiannis, Christos Gkantsidis: Sampling Strategies for Epidemic-Style Information Dissemination. INFOCOM 2008:1678-1686
- PEER-TO-PEER SYSTEMS AND REPLICATION
- Saurabh Tewari, Leonard Kleinrock: Proportional Replication in Peer-to-Peer Networks. INFOCOM 2006
- Elias Leontiadis, Vassilios V. Dimakopoulos, Evaggelia Pitoura: Creating and Maintaining Replicas in Unstructured Peer-to-Peer Systems. Euro-Par 2006:1015-1025
- Saurabh Tewari, Leonard Kleinrock: Analysis of search and replication in unstructured peer-to-peer networks. SIGMETRICS 2005:404-405
- Qiang Wang, Khuzaima Daudjee, M. Tamer Özsu: Popularity-Aware Prefetch in P2P Range Caching. Peer-to-Peer Computing 2008: 53-62
- SOCIAL NETWORKS
- David Carmel, Naama Zwerdling, Ido Guy, Shila Ofek-Koifman, Nadav Har'El, Inbal Ronen, Erel Uziel, Sivan Yogev, Sergey Chernov: Personalized social search based on the user's social network. CIKM 2009:1227-1236
- Mohammad Hossein Manshaei, Julien Freudiger, Márk Félegyházi, Peter Marbach, Jean-Pierre Hubaux: On Wireless Social Community Networks. INFOCOM 2008: 1552-1560