NWU Institutional Repository

ReLU and sigmoidal activation functions

dc.contributor.author: Pretorius, Arnold M.
dc.contributor.author: Barnard, Etienne
dc.contributor.author: Davel, Marelie H.
dc.date.accessioned: 2020-01-27T13:27:30Z
dc.date.available: 2020-01-27T13:27:30Z
dc.date.issued: 2019-12
dc.description.abstract: The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network. [en_US]
dc.identifier.citation: Arnold M. Pretorius, Etienne Barnard and Marelie H. Davel, “ReLU and sigmoidal activation functions”, In Proc. South African Forum for Artificial Intelligence Research (FAIR2019), pp. 37-48, Cape Town, South Africa, December 2019. [en_US]
dc.identifier.issn: 1613-0073
dc.identifier.uri: http://hdl.handle.net/10394/33957
dc.language.iso: en [en_US]
dc.publisher: In Proc. South African Forum for Artificial Intelligence Research (FAIR2019) [en_US]
dc.subject: Non-linear activation function [en_US]
dc.subject: Generalization [en_US]
dc.subject: Activation distribution [en_US]
dc.subject: Sparsity [en_US]
dc.title: ReLU and sigmoidal activation functions [en_US]
dc.type: Other [en_US]
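
The abstract describes comparing fully connected feedforward networks that differ only in their activation function (sigmoidal vs. ReLU) and examining phenomena such as node saturation and sparsity. The sketch below is a minimal, hypothetical illustration of that kind of comparison, not the authors' code: it builds two otherwise identical multilayer perceptrons in PyTorch and reports the fraction of near-zero and saturated hidden activations on random stand-in inputs. The layer sizes, depth, thresholds, and the use of random data in place of the paper's two benchmark problems are all assumptions.

    import torch
    import torch.nn as nn


    def make_mlp(act_cls, in_dim=784, hidden=256, depth=3, out_dim=10):
        """Fully connected feedforward network with a chosen non-linearity."""
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), act_cls()]
            dim = hidden
        layers.append(nn.Linear(dim, out_dim))
        return nn.Sequential(*layers)


    def hidden_activations(model, x):
        """Collect the output of every non-linearity as x passes through the net."""
        acts, h = [], x
        for layer in model:
            h = layer(h)
            if isinstance(layer, (nn.ReLU, nn.Sigmoid)):
                acts.append(h.detach())
        return acts


    torch.manual_seed(0)
    x = torch.randn(512, 784)  # random stand-in for a batch of benchmark inputs

    for name, net in [("ReLU", make_mlp(nn.ReLU)), ("sigmoid", make_mlp(nn.Sigmoid))]:
        acts = hidden_activations(net, x)
        # Sparsity: fraction of hidden units that are (near-)zero after the
        # non-linearity. Saturation (meaningful for the sigmoid's (0, 1) range):
        # fraction of units pushed towards the extremes.
        zero_frac = [float((a <= 1e-3).float().mean()) for a in acts]
        sat_frac = [float(((a <= 0.05) | (a >= 0.95)).float().mean()) for a in acts]
        print(name, "near-zero per layer:", zero_frac, "saturated per layer:", sat_frac)

Such a probe only surfaces the surface-level differences (dead ReLU units, saturated sigmoid units); the paper's argument concerns how class-distinctive information propagates through the layers, which would require analysing trained networks on the actual benchmark data.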

Files

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission