Abstract: | Online social media is today used during humanitarian disasters by victims, responders, journalists and others to publicly exchange accounts of ongoing events, requests for help, aggregate reports, reflections and commentary. In many cases, incident reports become available on social media before being picked up by traditional information channels, and often include rich evidence such as photos and video recordings. However, individual messages are sparse in content, and message inflow rates can reach hundreds of thousands of items per hour during large-scale events. Current information management methods struggle to make sense of this vast body of knowledge, due to limitations in the accuracy and scalability of processing, summarization capabilities, organizational acceptance and even basic understanding of users' needs. If solutions to these problems can be found, social media can be mined to offer disaster responders unprecedented levels of situational awareness. This thesis provides a first comprehensive overview of humanitarian disaster stakeholders and their information needs, against which the utility of the proposed and future information management solutions can be assessed. The research then shows how automated online text clustering techniques can provide report de-duplication, timely event detection, ranking and summarization of content in rapid social media streams. To identify and filter out reports that correspond to the information needs of specific stakeholders, crowdsourced information extraction is combined with supervised classification techniques to generalize human annotation behaviour and scale up processing capacity by several orders of magnitude. These hybrid processing techniques are implemented in CrisisTracker, a novel software tool, and evaluated through deployment in a large-scale multi-language disaster information management setting. 
Evaluation shows that the proposed techniques can effectively make social media an accessible complement to currently relied-on information collection methods, which enables disaster analysts to detect and comprehend unfolding events more quickly, deeply and with greater coverage.
|