ReLU and sigmoidal activation functions

Pretorius, Arnold M.; Barnard, Etienne; Davel, Marelie H.

Date

2019-12

Authors

Pretorius, Arnold M.

Barnard, Etienne

Davel, Marelie H.

Publisher

In Proc. South African Forum for Artificial Intelligence Research (FAIR2019)

Abstract

The generalization capabilities of deep neural networks are not well understood, and in particular, the influence of activation functions on generalization has received little theoretical attention. Phenomena such as vanishing gradients, node saturation and network sparsity have been identified as possible factors when comparing different activation functions [1]. We investigate these factors using fully connected feedforward networks on two standard benchmark problems, and find that the most salient differences between networks with sigmoidal and ReLU activations relate to the way that class-distinctive information is propagated through a network.
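The three factors named in the abstract can be made concrete with a small numerical sketch. The code below is illustrative only and is not taken from the paper: the helper names (`sparsity`, `saturation`), the saturation cut-off of 0.01, and the randomly drawn pre-activations are all assumptions chosen for this example. It shows that a sigmoid's gradient never exceeds 0.25 (so gradients shrink when multiplied across layers), that sigmoid units saturate for large inputs, and that ReLU produces exact zeros, which gives sparse activations.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25 at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    """Rectified linear unit, applied element-wise."""
    return np.maximum(z, 0.0)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (z > 0).astype(float)

def sparsity(activations):
    """Fraction of activations that are exactly zero (hypothetical helper)."""
    return float(np.mean(activations == 0.0))

def saturation(z, threshold=0.01):
    """Fraction of sigmoid units whose gradient falls below `threshold`.

    The threshold is an assumed cut-off for this sketch, not a value
    from the paper.
    """
    return float(np.mean(sigmoid_grad(z) < threshold))

# Assumed pre-activations: zero-mean Gaussian with a fairly wide spread.
rng = np.random.default_rng(0)
z = rng.normal(loc=0.0, scale=3.0, size=10_000)

print("ReLU sparsity:        ", sparsity(relu(z)))
print("Sigmoid saturation:   ", saturation(z))
print("Max sigmoid gradient: ", float(sigmoid_grad(np.array([0.0]))[0]))
# Upper bound on a gradient passed back through 10 sigmoid layers
# (ignoring the weights): 0.25 ** 10.
print("Gradient bound, 10 layers:", 0.25 ** 10)
```

With zero-mean inputs, roughly half of the ReLU outputs are zero, whereas sigmoid outputs are never exactly zero; the measures the authors actually use may differ from these simple proxies.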

Keywords

Non-linear activation function, Generalization, Activation distribution, Sparsity

Citation

Arnold M. Pretorius, Etienne Barnard and Marelie H. Davel, “ReLU and sigmoidal activation functions”, In Proc. South African Forum for Artificial Intelligence Research (FAIR2019), pp. 37-48, Cape Town, South Africa, December 2019.

URI

http://hdl.handle.net/10394/33957

Collections

Faculty of Engineering
