Summary: | Anonymisation is currently one of the biggest challenges when sharing sensitive personal information. Its importance depends largely on the application domain, but when dealing with health information, this becomes a more serious issue. A simpler approach to avoid this disclosure is to ensure that all data that can be associated directly with an individual is removed from the original dataset. However, some studies have shown that simple anonymisation procedures can sometimes be reverted using specific patients’ characteristics, namely when the anonymisation is based on hidden key attributes. In this work, we propose a secure architecture to share information from distributed databases without compromising the subjects’ privacy. The work was initially focused on identifying techniques to link information between multiple data sources, in order to revert the anonymization procedures. In a second phase, we developed the methodology to perform queries over distributed databases was proposed. The architecture was validated using a standard data schema that is widely adopted in observational research studies.
|