Motivation
In a smart city environment, the convergence of mobile communication, sensor and online social network technologies leads to an exponential increase in the creation and consumption of personal data, that is, data produced by or linkable to individuals. Moreover, in the vision of the smart city that we promote here, being aware of the physical state of the city is not sufficient: being aware of its social state is just as important for enabling new applications. However, combining such personal data can potentially describe every activity in citizens’ lives. This is a significant threat to privacy, which must be overcome. Respect for citizens’ privacy is a fundamental value of our smart city vision, upheld by social sustainability, and it leads us to incorporate privacy enforcement techniques by design in the underlying urban middleware.
Goals
We will design and validate new user-centric data sharing and usage models that put privacy first and interface with the urban middleware introduced in Section 3.2.2. Our approach [Anciaux13] is to let users control how their data are used by introducing personal trusted cells into the architecture. Trusted cells are units of secure hardware and software, located on the citizens’ side and owned by them, that can perform data management tasks in a privacy-preserving manner. Since not all data management tasks can be processed within personal trusted cells, we will also interface the trusted cells with personal and secure data management services running in sandboxed containers as part of the middleware system. The resulting privacy-by-design urban middleware system will provide: (1) strong privacy guarantees to citizens, based on a new user-centric data sharing and usage control model, and (2) the ability to integrate physical and social sensing technologies.
State of the art and challenges
We identify three broad areas of contribution for our work on building the privacy-by-design urban middleware system:
- Enforced access control and data sharing models: Social sensing implies producing and managing personal data on the user side. Access and usage control policies must be implemented and enforced by the citizens’ trusted cells, which must interconnect appropriately with middleware services. We are currently exploring a solution relying on two building blocks to provide such privacy-preserving data management features. First, we identify and design a set of rich operations, with an appropriate algebra, that trusted cells handle to perform local data computations, so that only the results of these computations are disseminated rather than the raw data needed to compute them. This first block clearly does not cover all the data-oriented processing required by the urban middleware and the applications: in practice, many data processing tasks cannot be performed locally in citizens’ trusted cells because they are specific and require significant computing resources. The second block therefore consists of sandboxed computation containers, placed on the middleware side, that interact with citizens’ trusted cells to perform more complex data computations without data leaks. Using these two building blocks, applications may be partitioned into several parts (trusted cell data processing performed on the user side, and middleware data processing performed in sandboxed containers interacting with the trusted cells), with the potential to effectively enforce access control over computed results without leaks.
- Trusted cells for the smart city: In the smart city context, many devices situated at the edge of the network can be considered potential trusted cells, and others may be created. To start with, we will concentrate on two kinds of emblematic devices: (i) microcontroller-based tokens (e.g., sensors confined in quantified-self appliances or SIM cards embedded in cell phones, which today represent a large number of the available trusted cells) and (ii) more powerful personal platforms (e.g., cell phones or tablets endowed with a TrustZone [ARM08] processor, or secure personal cloud data platforms as envisioned in [Lallali15]). The question we want to answer is how to construct trusted cells based on such devices. First, the device must be able to evaluate access and usage control primitives. While this is not an issue for traditional (powerful) database servers, it is still an open issue for microcontroller-based tokens and trusted platforms like TrustZone. Preliminary proposals investigate the computation of simple authorized views based on filtering predicates within sensor nodes (microcontrollers) equipped with large Flash memory (e.g., [TD11]). Other studies have started to address the support of relational operations like selection, projection and pre-computed joins within secure microcontrollers [To14], or full-text searches over collections of personal documents or trails [Anciaux15]. Other complex processing is however required to support data dissemination from the edge of the network, e.g., access control models for time series.
- Global and anonymous large-scale computations: Computing results over a population of individuals may involve a very large number of trusted cells, with limited resources and potentially low connectivity. The main goal is to organize the computation so that only the result is revealed, and neither the raw data nor the identity of the individuals who participated in the computation. Previous works addressed Privacy-Preserving Data Publishing [Allard14] and distributed SQL query computation [To14], demonstrating the feasibility of large-scale distributed computations over trusted cells. But these works rely on the strong security assumption that the involved trusted cells cannot be compromised. This may not hold in practice, as many trusted cells are left in the users’ hands (e.g., quantified-self devices). In a smart city environment, computing regular global queries, e.g., basic aggregations to discover overcrowded roads, conduct public surveys or enable large-scale comparisons within quantified-self applications, is thus still an open issue. A new computation paradigm should be worked out, in which potential data leaks are controlled and avoided even when some participating trusted cells are compromised.
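As a rough illustration of the computation pattern discussed in the three areas above, the following Python sketch shows trusted cells that apply a filtering-predicate policy locally and return only a partial count, which is then summed into a global result. The class `TrustedCell`, its methods and the record fields (`road`, `hour`) are hypothetical names chosen for this example. The sketch assumes every cell is honest and uses no cryptography or secure hardware, so it does not address the compromised-cell problem raised above.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Predicate = Callable[[Record], bool]


class TrustedCell:
    """Hypothetical stand-in for a citizen's trusted cell (no real security)."""

    def __init__(self, policy: Optional[Predicate] = None) -> None:
        self._records: List[Record] = []  # raw data never leaves the cell
        # Access policy modelled as a filtering predicate (an assumption):
        # only records for which it returns True may contribute to a query.
        self._policy: Predicate = policy if policy is not None else (lambda r: True)

    def store(self, record: Record) -> None:
        self._records.append(record)

    def local_count(self, query: Predicate) -> int:
        """Evaluate the query locally and return only a count."""
        return sum(1 for r in self._records if self._policy(r) and query(r))


def global_count(cells: List[TrustedCell], query: Predicate) -> int:
    """Sum the partial counts; contributions carry no owner identity."""
    return sum(cell.local_count(query) for cell in cells)


# Usage: count citizens seen on road "A1" at 8 a.m.
alice = TrustedCell(policy=lambda r: r["road"] != "Home St")  # hides her street
alice.store({"road": "A1", "hour": 8})
alice.store({"road": "Home St", "hour": 8})
bob = TrustedCell()
bob.store({"road": "A1", "hour": 8})
carol = TrustedCell()
carol.store({"road": "B2", "hour": 8})

on_a1 = lambda r: r["road"] == "A1" and r["hour"] == 8
print(global_count([alice, bob, carol], on_a1))  # prints 2
```

In this sketch the querier only ever sees the final sum. A real deployment would also have to protect the partial counts in transit and tolerate cells that misbehave, which is the open issue identified above.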
Methodology
We will first define a general privacy-preserving data management architecture, relying on the trusted cell vision and compliant with the requirements and constraints of the urban middleware. This means in particular (1) defining the features and API that trusted cells should offer, and (2) defining the main local and global data management services that cover the basic needs of the data-oriented processing operated by the urban middleware and the applications (data availability, durability and sharing, queries, user-defined functions, etc.). This preliminary study will remain independent of any particular trusted cell platform. Second, we will consider specific instances of trusted cells, with a particular interest in secure hardware platforms that can be available on the citizen side. Third, we will design and implement a set of trusted cell services, some embedded within the trusted cells and others implemented as sandboxed web services as part of the urban middleware, without diminishing the privacy level offered by the trusted cells.
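The third step, which splits services between trusted cells and sandboxed middleware services, might be organized as in the following Python sketch. All names (`TrustedCell`, `SandboxedContainer`, `request`, `run`) are assumptions made for illustration, not an API defined by this proposal. The Python class provides no real isolation: an actual container would need OS-level or hardware-level sandboxing.

```python
from statistics import mean
from typing import Callable, List, Set


class SandboxedContainer:
    """Hypothetical middleware-side container.

    Assumption: it keeps no state between calls and its only output
    channel is the return value handed back to the calling cell.
    """

    def run(self, function: Callable[[List[float]], float],
            values: List[float]) -> float:
        # Work on a copy so the function cannot alter the cell's data.
        return function(list(values))


class TrustedCell:
    """Hypothetical citizen-side cell that decides who may see a result."""

    def __init__(self, readings: List[float], authorized_apps: Set[str]) -> None:
        self._readings = readings
        self._authorized_apps = authorized_apps

    def request(self, app: str, function: Callable[[List[float]], float],
                container: SandboxedContainer) -> float:
        # Access control is enforced in the cell, before any data moves.
        if app not in self._authorized_apps:
            raise PermissionError(f"{app} is not authorized")
        # The heavy computation is delegated; only the result is released.
        return container.run(function, self._readings)


# Usage: a fitness application asks for an average heart rate.
cell = TrustedCell([60.0, 70.0, 80.0], authorized_apps={"fitness_app"})
print(cell.request("fitness_app", mean, SandboxedContainer()))  # prints 70.0
```

The design choice shown here is that the application never receives the readings themselves: it names a function, the cell checks the policy, and the container returns a single value through the cell.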
References
[Anciaux15] Anciaux, N., Lallali, S., Popa, I. S., & Pucheral, P. (2015). A scalable search engine for mass storage smart objects. Proceedings of the VLDB Endowment, 8(9), 910-921.
[Anciaux14a] Anciaux, N., Bouganim, L., Pucheral, P., Guo, Y., Le Folgoc, L., & Yin, S. (2014). MILo-DB: a personal, secure and portable database machine. Distributed and Parallel Databases, 32(1), 37-63.
[Allard14] Allard, T., Nguyen, B., & Pucheral, P. (2014). METAP: revisiting Privacy-Preserving Data Publishing using secure devices. Distributed and Parallel Databases, 32(2), 191-244.
[To14] To, Q. C., Nguyen, B., & Pucheral, P. (2014). Privacy-Preserving Query Execution using a Decentralized Architecture and Tamper Resistant Hardware. In EDBT (pp. 487-498).