Data Extraction via Web Scraping as Support for Analyses Involving Distributed Generation

  • Renan Moreira Soares, Instituto Federal de Goiás, Itumbiara, Goiás
  • Guilherme Rezende Pereira Camargo, Instituto Federal de Goiás, Itumbiara, Goiás
  • Marcelo Escobar de Oliveira, Instituto Federal de Goiás, Itumbiara, Goiás
  • Leonardo Garcia Marques, Instituto Federal de Goiás, Itumbiara, Goiás
Keywords: data extraction, distributed generation, socioeconomic factors, statistics, web scraping


In recent years, photovoltaic generation in Brazil has grown substantially. Analyzing this growth is fundamental for decision-making in both the public and private sectors, because it can have major impacts on the electrical system and on the quality of the electricity delivered to consumers. Investigating the factors that may influence this increase is therefore essential so that actions and investments can be directed toward improving the electricity networks. Statistical studies have been developed to verify the influence of socioeconomic factors on this growth. However, carrying them out requires collecting a large amount of data to ensure a robust analysis. These data can be collected manually from the websites where they are published, but manual retrieval is slow and error-prone and can compromise data reliability. In this work, web scraping tools are therefore presented that collect data from two different websites, supporting different lines of research on the growth of distributed generation. Finally, an analysis of the collected data is shown, demonstrating the usefulness of these tools.