Tracking Gauteng thunderstorms using crowdsourced Twitter data between Soweto and Pretoria
Abstract
Summer thunderstorms in Gauteng are often dramatic, noisy, wet events. They can
appear suddenly on exceptionally hot, sunny days and travel fast across the province.
With such dramatic arrivals, people often flock to social media sites such as Twitter to
comment on the rain, wind, hail, lightning and thunder. This paper investigates
whether the track of a Gauteng thunderstorm can be mapped using crowdsourced data
from Twitter. It describes a model (the ThunderChatter Model) and an
instantiation of that model which extracts data from Twitter, analyses the tweet text
for thunderstorm-related content and plots the relevant data on a map. For
evaluation purposes, these generated maps are then compared against lightning-stroke
maps provided by the South African Weather Service. The maps are visually compared
by independent assessors using Content Analysis techniques, ensuring unbiased and
reproducible results. The results of this research are mixed. For thunderstorms which
traverse the strip of land between Soweto and Pretoria that roughly follows the N1
highway (the most heavily populated area of Gauteng and the area with
the highest percentage of homes with Internet access), the results are excellent. However, in
outlying areas of Gauteng such as Carletonville, Heidelberg, Hammanskraal and
Bronkhorstspruit, thunderstorms can be tracked using crowdsourced Twitter data
only in the case of extreme storms which damage property. The results imply that data
obtained from social media could, in some cases, supplement geographical data
obtained from traditional sources.
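
The three stages named in the abstract (extract, analyse, plot) can be illustrated with a minimal Python sketch. This is not the ThunderChatter implementation, and it rests on several assumptions: the `Tweet` record, the `STORM_TERMS` keyword list, and the functions `mentions_storm` and `storm_track` are hypothetical names introduced here; tweets are assumed to be already retrieved and geotagged; and simple keyword matching stands in for whatever textual analysis the model actually performs. The coordinates in the sample are approximate and for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical keyword list; the vocabulary used by the actual model is not stated.
STORM_TERMS = {"thunder", "lightning", "hail", "rain", "wind", "storm"}


@dataclass
class Tweet:
    """Assumed minimal record for a tweet that has already been retrieved."""
    text: str
    timestamp: datetime
    lat: float
    lon: float


def mentions_storm(text: str) -> bool:
    """Return True if any word in the text matches a storm keyword."""
    words = {w.strip(".,!?#@:;\"'").lower() for w in text.split()}
    return not STORM_TERMS.isdisjoint(words)


def storm_track(tweets):
    """Keep storm-related tweets and order them by time.

    The result is a list of (timestamp, lat, lon) points that could be
    handed to any mapping tool to draw an approximate storm track.
    """
    hits = [t for t in tweets if mentions_storm(t.text)]
    hits.sort(key=lambda t: t.timestamp)
    return [(t.timestamp, t.lat, t.lon) for t in hits]


# Illustrative input only: invented tweets with approximate coordinates.
sample = [
    Tweet("#Thunder rolling over Pretoria", datetime(2014, 1, 10, 16, 40), -25.75, 28.19),
    Tweet("Lovely braai weather today", datetime(2014, 1, 10, 15, 0), -26.20, 28.05),
    Tweet("Huge hail in Soweto!", datetime(2014, 1, 10, 15, 30), -26.27, 27.86),
]

for point in storm_track(sample):
    print(point)
```

Ordering the filtered points by timestamp is the simplest way to turn scattered reports into a track. A fuller treatment would also need to handle tweets without coordinates and reports that arrive well after the storm has passed.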