Flow time series clustering for demand pattern recognition in drinking water distribution systems: New insights about the most adequate methods


Bibliographic details
Main author: Gomes, Pedro André Fonseca Garez (author)
Format: masterThesis
Language: eng
Published: 2019
Subjects:
Full text: http://hdl.handle.net/10071/20195
Country: Portugal
OAI: oai:repositorio.iscte-iul.pt:10071/20195
Description
Abstract: This study proposes clustering methodologies for demand pattern recognition using network flow data collected from a large set of drinking water distribution networks in Portugal. Most existing studies on clustering of flow time series rely on hierarchical or k-Means clustering algorithms with inelastic distance measures. This study explores alternative clustering algorithms, distance measures, comparison time windows, internal index metrics and clustering prototypes. The performance of the alternative clustering methodologies was assessed in terms of multiple internal index metrics and the characterization of the cluster centroids. The best-performing methods were the Partition Algorithm with DTW (dynamic time warping) distance, PAM prototype and a 15-minute time window, and the Partition Algorithm with GAK (global alignment kernel) distance, PAM prototype and a 15-minute time window, because they allow a clear partition of the flow time series into three clusters. The first method identifies a night consumption pattern, a typical weekend pattern and a typical working day pattern, whereas the second identifies a pattern with small variability between night and daily consumption. To improve knowledge extraction about typical and anomalous patterns, additional clustering operations were performed on the flow data belonging to the cluster with small variability between night and daily consumption. New clusters were identified and characterized by weekday, geographical location, and dry and wet months, showing that patterns associated with garden irrigation are independent of the period of the day and the season of the year, which indicates inefficient water use.
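
To make the combination of an elastic distance with a medoid prototype more concrete, the following is a minimal Python sketch, not the thesis's implementation (the abstract does not state which software was used). It assumes synthetic daily flow profiles sampled every 15 minutes (96 points per day), a plain DTW distance with absolute-difference cost, and a simple alternating k-medoids update in place of the full PAM swap search. All function names, profile shapes and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of series."""
    k = len(series)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    return dist


def k_medoids(dist, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix.

    Simplified stand-in for PAM: assign each series to its nearest medoid,
    then move each medoid to the member with the smallest total distance
    to the rest of its cluster, until the medoids stop changing.
    """
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(dist), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


# Synthetic daily profiles at 15-minute resolution (hypothetical shapes).
hours = np.arange(96) / 4.0
working_day = 1.0 + 0.8 * np.exp(-((hours - 8) / 1.5) ** 2) + 0.6 * np.exp(-((hours - 20) / 2.0) ** 2)
weekend = 1.0 + 0.9 * np.exp(-((hours - 12) / 2.5) ** 2)
flat = np.full(96, 1.2)  # small variability between night and day

noise = np.random.default_rng(42)
series = np.array([
    shape + noise.normal(0.0, 0.03, size=96)
    for shape in (working_day, weekend, flat)
    for _ in range(5)
])

dist = pairwise_dtw(series)
medoids, labels = k_medoids(dist, k=3)
print("medoid indices:", medoids)
print("cluster labels:", labels)
```

Because a medoid is always one of the observed series, each cluster prototype here is a real daily profile that can be inspected directly. A GAK-based variant would replace `dtw_distance` with a kernel-derived dissimilarity while leaving the medoid step unchanged.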