Resumo: | The new generation of communication networks has brought with it the digitalization of companies and services, changing not only the way we communicate with each other but also the way people and entities exchange personal and confidential data. The Internet of Things (IoT) is one of the technological paradigms that benefits most from these new forms of connectivity. It keeps us permanently connected to people, companies, our homes, our cities, and our smart devices, and it lets us automate tasks or control situations remotely that would not be possible without this type of equipment and technology. With the globalization of networks and services, however, protecting our data and our privacy has become a real concern. Although several security options already exist, both in companies and at our service providers, the amount of data currently generated far exceeds the capacity of humans and systems to analyze what is happening on our networks. In this context, this dissertation uses data science and machine learning techniques to deal with the volume of data generated by an IoT network. The chosen scenario is the network of a smart city, in which an intrusion detection system supported by a machine learning model is deployed to detect any activity that is not recognized as normal production behavior. Anomaly detection was implemented with machine learning algorithms that classify network flows as benign or malicious. A comparison of supervised and unsupervised classification algorithms on a dataset from an IoT network, with flows previously labeled as normal or malicious traffic, showed that supervised classifiers obtain the best results, although they are limited when faced with an attack that is not represented in the dataset.
By combining an intrusion detection system with data science, and specifically with machine learning models, this dissertation demonstrates that the approach is a valid cybersecurity solution and that it provides an additional layer of protection for our networks, services, and data.
|