Some possible M.Sc. Thesis Topics



Distributed middleware cache

The application server tier in multi-tier architectures is responsible for executing the application logic. While it maintains volatile data, mission-critical data is stored in a database backend. However, as database access is expensive, many application servers provide a cache that stores database records in the object-oriented format needed by the application programs. When application servers are replicated for scalability, the application server cache needs a special solution. In this project we develop a distributed object cache in which the application server replicas share their cache resources.
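
To give a flavour of what sharing cache resources between replicas can mean, here is a minimal sketch in Python. It is not the design used in the project: it assumes that cached objects are partitioned over the replicas with consistent hashing, and the class name SharedObjectCache, the replica names and the load_from_db callback are all hypothetical.

```python
import bisect
import hashlib


def _ring_position(value: str) -> int:
    """Map a string to a stable position on the hash ring."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class SharedObjectCache:
    """Toy model of a cache partitioned over application server replicas.

    Assumption for illustration only: each key is owned by exactly one
    replica, chosen by consistent hashing with virtual nodes.
    """

    def __init__(self, replicas, virtual_nodes=64):
        # Place several virtual nodes per replica on the ring to even out load.
        self._ring = sorted(
            (_ring_position(f"{name}#{i}"), name)
            for name in replicas
            for i in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]
        # One in-memory store per replica stands in for its local cache.
        self._stores = {name: {} for name in replicas}

    def owner(self, key: str) -> str:
        """Return the replica responsible for caching this key."""
        idx = bisect.bisect_right(self._points, _ring_position(key))
        return self._ring[idx % len(self._ring)][1]

    def get(self, key: str, load_from_db):
        """Return the cached object, loading it from the database on a miss."""
        store = self._stores[self.owner(key)]
        if key not in store:
            store[key] = load_from_db(key)
        return store[key]
```

With such a scheme, a request arriving at any replica can be routed to the owner of the key, so each database record is cached once in the cluster instead of once per replica. Replication of hot objects and load balancing, mentioned as open topics in this section, are deliberately left out.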

There are currently two M.Sc. students working on this topic. I am looking for students to continue the work and explore other directions. Possible topics in this context include: development of a cache that supports both distribution/sharing and replication; load balancing at the application server tier to make optimal use of the distributed cache; and exploitation of special hardware to speed up the cache (multi-core, DRAM, ...).

Required knowledge:
Good knowledge of distributed systems technology. Experience with large Java-based distributed software systems is highly recommended.

Database Virtualization

Data centers have become a predominant infrastructure for supporting the ever-increasing number of e-commerce applications. These data centers often host hundreds or even thousands of applications concurrently, distributed over large clusters of server machines. In order to provide the quality of service required by customers and at the same time exploit the existing infrastructure as efficiently as possible, data centers must be able to dynamically assign, replicate or migrate application components to the different machines in the cluster.

Virtualization is an attractive means to provide component and service migration. Applications are installed on virtual machines which provide an abstraction from the physical machines. Current technology allows virtual machines to be collocated on the same physical machine and to migrate relatively easily between machines.
However, it is not clear how well migration actually works for the different tiers of the typical multi-tier architecture on which current e-commerce applications run.

The task of this thesis is to analyze the ability of current virtualization products to handle virtualization of a multi-tier architecture. These techniques should then be compared with data transfer, failover and recovery technology developed in the area of database and process replication. A final goal is the development of virtualization techniques specifically designed for multi-tier architectures.

Required knowledge:
Good knowledge of the internals of database systems and operating systems. Willingness to explore the technology behind existing systems.


Multicast Communication in Mobile Environments

This work is likely to be in cooperation with SAP Montreal.
While group communication, multicast primitives and publish/subscribe systems are well explored for wired networks, mobile environments pose new challenges. Issues include the low-quality network connections of mobile hosts, the variety of channels through which they can communicate with servers (data plan, SMS, wireless access points, ...), and the large number of mobile clients that have to be handled by central services.

A current student is developing a basic publish/subscribe infrastructure that connects mobile clients via a central publish/subscribe server. As a next step, we would like to connect the mobile publish/subscribe system with a central server and see how large mobile user groups can be supported.

Required knowledge:
Good knowledge of distributed systems technology. Willingness to work with existing software and simulation.


Peer-to-peer based social networks

In our information-driven society it is becoming increasingly challenging to find the information and content we need. So far, we can either use one of the few existing central lookup systems (e.g., Google) or global peer-to-peer (P2P) systems (e.g., BitTorrent). The former have huge storage and crawling requirements and can only track data that is publicly available on the web. The latter are mainly focused on special-purpose data, such as movie files. However, information is often stored in heterogeneous formats (emails, software, various file formats, etc.), while the people looking for a particular kind of information are quite similar and could be grouped into interest groups. Neither aspect is served well by existing technology.

This project envisions a new generation of content management and lookup based on the idea of peer-to-peer social networking, exploiting social relations to enhance content distribution and lookup services. The goal of our research is to design and develop a P2P content distribution platform where groups of people share application-specific content. Each member is directly linked to his/her best friends or most trusted acquaintances. Paths in the network represent indirect relationships with decreasing trust. Users actively upload their information to the system, where it is cached and indexed in a way that is aware of the social relationships between peers.
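
As an illustration of how trust could decrease along paths in such a network, the following Python sketch computes the trust between two peers. The model is an assumption made for this example, not something fixed by the project: every direct link carries a trust value in (0, 1], the trust of a path is the product of its link values, and the trust between two peers is that of their best path. The function name best_trust and the sample peers are hypothetical.

```python
import heapq


def best_trust(graph, source, target):
    """Return the highest path trust from source to target, or 0.0 if unreachable.

    graph maps each peer to {friend: trust}, with trust in (0, 1].
    Because every link value is at most 1, path trust never increases
    along a path, so a Dijkstra-style search that always expands the
    most trusted peer first finds the best path.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap emulated with negated trust
    while heap:
        neg_trust, peer = heapq.heappop(heap)
        trust = -neg_trust
        if peer == target:
            return trust
        if trust < best.get(peer, 0.0):
            continue  # stale heap entry, a better path was already found
        for friend, link_trust in graph.get(peer, {}).items():
            candidate = trust * link_trust
            if candidate > best.get(friend, 0.0):
                best[friend] = candidate
                heapq.heappush(heap, (-candidate, friend))
    return 0.0


# Hypothetical example network: alice reaches dave via bob or via carol.
friends = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.9},
}
trust_alice_dave = best_trust(friends, "alice", "dave")
```

A platform along these lines could use such a score to decide where to cache content or how to rank lookup results, although other trust models (for example, a fixed decay per hop) are equally plausible.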

The task is to build a basic prototype of such a platform. The research will be done with Institut TELECOM, Telecom SudParis, who already have a system that could be used as a basis.

Required knowledge:
Good knowledge of distributed systems technology. Willingness to work with existing software and simulation.


Other possible areas

I am also interested in exploring issues in data lineage: a platform to track the movement and development of data in highly distributed systems, to store that information in a central data source, and to provide a query engine that allows users to pose interesting queries over such lineage information.
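
As a rough idea of what a query over lineage information might look like, the following Python sketch records which data items were derived from which inputs and answers the question "where did this item come from?". The class LineageStore, its methods and the sample item names are hypothetical and only meant as illustration; a real platform would have to collect this information from many distributed components.

```python
from collections import defaultdict


class LineageStore:
    """Toy central store for lineage records (illustrative assumption)."""

    def __init__(self):
        # Maps a derived item to the set of (input item, operation) pairs.
        self._parents = defaultdict(set)

    def record(self, output, inputs, operation):
        """Record that output was produced from inputs by operation."""
        for item in inputs:
            self._parents[output].add((item, operation))

    def ancestors(self, item):
        """Return every item that the given item directly or indirectly derives from."""
        seen = set()
        stack = [item]
        while stack:
            current = stack.pop()
            for parent, _operation in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical example: a report built from cleaned orders and customer data.
store = LineageStore()
store.record("orders_clean", ["orders_raw"], "filter")
store.record("report", ["orders_clean", "customers"], "join")
report_sources = store.ancestors("report")
```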

We might be looking for a further student in the Mammoth project, the massively multiplayer online game engine (see the corresponding webpage).

Further topics are possible; if you have a specific idea, let me know.