Context of Work:
Within CityLab@Inria, we consider that smart cities should emphasize the awareness and participation of citizens in the governance of the city (aka “citizen as a sensor and actuator”), through the comprehensive exploitation of participation-enabling technologies such as Internet-powered crowd-Xing and the opening of public data. From a data management perspective, this vision lies at the convergence of “open data” and “crowd sensing data”. Open data describe the city and are made publicly available by city representatives and stakeholders. Crowd sensing data are collected from volunteer citizens either explicitly (e.g., a citizen indicates that a metro train is overcrowded) or automatically from citizens’ devices such as smartphones or wearable sensors. Cross-exploitation of open data and crowd sensing data is expected to give rise to many new urban services, as illustrated below.
The Urban Civics service developed within CityLab@Inria focuses on producing urban maps of environmental hazards such as noise pollution or poor air quality, with a high level of confidence (e.g., see  for the case of noise pollution). Such urban maps may be computed using advanced data assimilation techniques, which combine all the relevant information sources so as to deliver the best estimate of the environment state (e.g., the pollution levels) throughout the city. The information sources are typically numerical simulation and observations from fixed and mobile sensors. Results of numerical simulation as well as data gathered from fixed sensors may commonly be made available by city governments as open data. Data collected from mobile sensors carried by citizens are known to be less accurate, but they are cheap and potentially numerous, although their volume depends on the actual involvement of citizens.
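As a rough illustration of the data assimilation idea described above, the following minimal sketch fuses a simulated value with sensor readings at a single map point using inverse-variance weighting. This is a textbook simplification chosen for illustration, not the method used by Urban Civics; the function name, the variances and the sample values are all assumptions.

```python
def fuse_estimates(background, background_var, observations):
    """Fuse a simulated value with sensor readings by inverse-variance weighting.

    background:     value from the numerical simulation (assumed unbiased)
    background_var: error variance of the simulation
    observations:   list of (value, variance) pairs; a larger variance means
                    a less trusted sensor (e.g., a citizen's smartphone)
    Returns (estimate, variance_of_estimate).
    """
    weight_sum = 1.0 / background_var
    weighted_total = background / background_var
    for value, variance in observations:
        weight_sum += 1.0 / variance
        weighted_total += value / variance
    return weighted_total / weight_sum, 1.0 / weight_sum


# Hypothetical pollution levels at one location (arbitrary units).
simulated = 60.0
readings = [
    (64.0, 4.0),    # fixed sensor: accurate, small variance
    (70.0, 25.0),   # mobile sensor: less accurate, large variance
]
estimate, variance = fuse_estimates(simulated, 4.0, readings)
```

In this sketch, the less accurate mobile reading still contributes to the estimate but with a smaller weight, and the variance of the fused estimate is lower than that of any single source, which is why numerous cheap sensors can remain useful.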
Monitoring urban pollution in this way further makes it possible to inform citizens about the resulting impact on their health, thanks to published clinical studies together with the increased availability of smart health watches/bracelets (e.g., see ). In particular, stress levels may be correlated with exposure to urban pollution. As a result, citizens become more aware of pollution and its impact, and may take action to reduce their exposure and, ultimately, pollution itself. In general, Urban Civics intends to offer a set of applications to better understand environmental hazards based on citizen participation. However, such urban participatory sensing raises significant privacy challenges, which are the focus of the proposed PhD topic.
While citizens’ mobile devices, and especially smartphones, may significantly help sensing urban data for the good of the individual and of the community, the benefits cannot be achieved without addressing the evident privacy questions that arise. In particular, we have to distinguish between crowd-sensing data that may (or should) be anonymous and sensed data that must be considered as personal information (typically, because they reveal sensitive information, e.g., location, health, habits or activity, of an identified citizen).
Going back to the Urban Civics use case, the noise map may be fed using anonymous contributions. To provide such anonymity guarantees to the volunteer citizens, privacy-preserving techniques must be embedded in the sensing application (e.g., to limit the data collection to anonymous and infrequent contributions), and major privacy principles must be strictly applied to the crowd-sensing central server collecting all the anonymous contributions (e.g., no retention of IP addresses to avoid data linkage, limited collection and limited exposure techniques, etc.). However, still considering the Urban Civics use case, the target correlation between pollution- and health-related data involves monitoring personal data, as is typically the case for any “quantified-self” application, which is often based on spatio-temporal data and personal time series. Indeed, it is now well known that a person’s identity can easily be inferred from a geo-localized trail, even if pseudonymized  or made of sequences of location samples . As a consequence, spatio-temporal trails are now considered as personal information . Time series produced by body sensors may reflect health and are therefore highly sensitive. Citizens’ global characteristics or attributes (age, gender, health, employment, social relations, etc.), necessary to cluster and compare the trails produced by different citizens, can be used (from a privacy perspective) as quasi-identifiers to relate the digital trails to the citizen’s identity. As a consequence, at least spatio-temporal trails, body sensor trails and citizens’ attributes should be considered as personal data and managed accordingly.
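For the anonymous contributions discussed above, limited collection on the sensing application side could look like the following minimal sketch: identifiers are dropped, location and time are coarsened, and overly frequent contributions are discarded. All names, field layouts and thresholds are hypothetical, and such coarsening alone does not guarantee anonymity; it only illustrates the principle.

```python
import math

GRID_DEGREES = 0.01      # hypothetical spatial cell size (in degrees)
TIME_BUCKET_S = 3600     # hypothetical time bucket (one hour)
MIN_INTERVAL_S = 1800    # hypothetical minimum gap between contributions


def anonymize_reading(reading, last_sent_at):
    """Return a coarsened, identifier-free contribution, or None if too frequent.

    reading:      dict with "lat", "lon", "timestamp" (seconds) and "noise_db";
                  any other field (e.g., a user or device identifier) is dropped
    last_sent_at: timestamp of the previous contribution, or None
    """
    if last_sent_at is not None and reading["timestamp"] - last_sent_at < MIN_INTERVAL_S:
        return None  # infrequent contributions only
    return {
        "cell_lat": math.floor(reading["lat"] / GRID_DEGREES),
        "cell_lon": math.floor(reading["lon"] / GRID_DEGREES),
        "hour_bucket": reading["timestamp"] // TIME_BUCKET_S,
        "noise_db": round(reading["noise_db"]),
    }


raw = {"user_id": "u1", "lat": 48.8566, "lon": 2.3522,
       "timestamp": 7200, "noise_db": 63.4}
contribution = anonymize_reading(raw, last_sent_at=None)
```

The server-side principles mentioned above (e.g., no retention of IP addresses) are outside the scope of this sketch and would have to be enforced separately.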
From our perspective, personal/sensitive crowd-sensing data should be managed under the control of the citizen owning the data and come with strong guarantees in terms of security and privacy. Toward that goal, our approach relies on the notion of Secure Personal Cloud, which can be thought of as a secure dedicated box belonging to a given citizen (kept at home or remotely) and in charge of organizing the personal data space in a database style. The Secure Personal Cloud is specifically designed to: ease personal data management, allow data from multiple “data silos” to be combined, and protect personal data against loss, theft and abusive use. Many projects and startups currently investigate Personal Cloud solutions (e.g., see OpenPDS, CozyCloud, OwnCloud). The specificity of our approach, further described in , is to rely on a secure database machine  to enforce the privacy rules established on the Personal Cloud.
A Secure Personal Cloud platform designed for citizens of smart cities should be able to: manage personal crowd-sensing data (in particular spatio-temporal trails and body sensor trails), interoperate with the sensors carried by the citizen, and perform privacy-preserving cross-computations involving personal data and centralized urban services. While the general objective is easy to express, putting this concept into practice raises a number of challenges, including:
- Data collection and access rights: Assuming that the citizen’s personal data is issued by the citizen’s devices, how to convey and host this information in the Secure Personal Cloud? How is the data further queried by the citizen’s mobile applications and the authorized urban services according to the dissemination (or access control) rules expressed by the citizen owning the data?
- Cross-computations with urban services and data sharing: How can citizens compare their own digital trail with centralized public and anonymous urban data? How to share a transversal subset of personal data within a group of citizens with minimal knowledge about data usage by third parties, while guaranteeing privacy to the contributing citizens?
- Global computations: How can citizens participate in a large-scale distributed computation, involving large communities of anonymous citizens, while preserving their anonymity?
Our objective is to tackle the three challenges above by designing and implementing a Privacy-by-Design Personal Cloud for urban systems. The platform will rely on the combination of two complementary technologies investigated by the Inria MIMOVE (http://mimove.inria.fr/) and SMIS (http://www-smis.inria.fr/) teams, namely: (1) an open source, home-hosted, decentralized middleware enabling urban-scale mobile crowd-sensing and (2) a hardware-based solution connected with the middleware to protect the data and regulate the data exchanges and usages offered by the middleware.
 M. Gruteser and B. Hoh. On the anonymity of periodic location samples. In Security in Pervasive Computing, pages 179–192. Springer, 2005.
 R.A. Popa, A.J. Blumberg, H. Balakrishnan, and F.H. Li. Privacy and accountability for location-based aggregate statistics. In Proceedings of the 18th ACM conference on Computer and communications security, pages 653–666. ACM, 2011.
 T. Xu and Y. Cai. Feeling-based location privacy protection for location based services. In Proceedings of the 16th ACM conference on Computer and communications security, pages 348–357. ACM, 2009.
 Sara Hachem, Vivien Mallet, Raphaël Ventura, Pierre-Guillaume Raverdy, Animesh Pathak, Valérie Issarny, Rajiv Bhatia. Monitoring Noise Pollution Using The Urban Civics Middleware. In IEEE BigDataService 2015, Mar 2015. To appear. https://hal.inria.fr/hal-01109321.
 Sara Hachem, Georgios Mathioudakis, Animesh Pathak, Valérie Issarny, Rajiv Bhatia. Sense2Health: A Quantified Self Application for Monitoring Personal Exposure to Environmental Pollution. In SENSORNETS 2015, Feb 2015. To appear. https://hal.inria.fr/hal-01102275.
 Nicolas Anciaux, Philippe Bonnet, Luc Bouganim, Philippe Pucheral. Trusted Cells: Ensuring Privacy for the Citizens of Smart Cities. ERCIM News Vol. 2014(98), 2014.
 Nicolas Anciaux, Luc Bouganim, Philippe Pucheral, Yanli Guo, Lionel Le Folgoc, Shaoyi Yin. MILo-DB: a personal, secure and portable database machine. Distributed and Parallel Databases, Vol. 32(1), pp. 37-63, 2014.
Conditions for application:
- The candidate must hold a Master’s in computer science or an equivalent diploma.
- Candidates who obtained their Master’s degree in the Paris region cannot apply for this position; there is no restriction on the candidates’ nationality.
How to apply:
- Detailed CV including full publication record
- Letter of motivation
- 3 references
The position is to be filled as soon as possible, and applications may be submitted until the position is filled.