Neural network

{{Short description|Structure in biology and artificial intelligence}}

{{other uses|Neural network (disambiguation)}}

A neural network is a group of interconnected units called neurons that send signals to one another. Neurons can be either biological cells or mathematical models. While individual neurons are simple, many of them together in a network can perform complex tasks. There are two main types of neural networks: biological and artificial.

== In biology ==

[[File:Projections of Gpr101 TomatoMSNsinSTR.gif|thumb|Part of a biological neural network in a mouse's striatum]]

{{main|Neural network (biology)}}

In the context of biology, a neural network is a population of biological neurons chemically connected to each other by synapses. A given neuron can be connected to hundreds of thousands of synapses.<ref>{{cite journal |last1=Shao |first1=Feng |last2=Shen |first2=Zheng |date=9 January 2022 |title=How can artificial neural networks approximate the brain? |journal=Front. Psychol. |volume=13 |page=970214 |doi=10.3389/fpsyg.2022.970214 |doi-access=free |pmid=36698593 |pmc=9868316}}</ref>

Each neuron sends and receives electrochemical signals, called action potentials, to and from its connected neighbors. A neuron can serve an excitatory role, amplifying and propagating the signals it receives, or an inhibitory role, suppressing them instead.

Populations of interconnected neurons smaller than neural networks are called neural circuits. Very large interconnected networks are called large-scale brain networks, and many of these together form brains and nervous systems.

Signals generated by neural networks in the brain eventually travel through the nervous system and across neuromuscular junctions to muscle cells, where they cause contraction and thereby motion.<ref>{{cite book |last1=Levitan |first1=Irwin |last2=Kaczmarek |first2=Leonard |title=The Neuron: Cell and Molecular Biology |chapter=Intercellular communication |publisher=Oxford University Press |edition=4th |date=August 19, 2015 |location=New York, NY |pages=153–328 |isbn=978-0199773893}}</ref>

== In machine learning ==

{{main|Neural network (machine learning)}}

[[File:Neural network example.svg|thumb|An artificial neural network]]

In machine learning, a neural network is an artificial mathematical model used to approximate nonlinear functions. While early artificial neural networks were physical machines, today they are almost always implemented in software.

Neurons in an artificial neural network are usually arranged into layers, with information passing from the first layer (the input layer) through one or more intermediate layers (the hidden layers) to the final layer (the output layer).<ref>{{Cite book |last=Bishop |first=Christopher M. |title=Pattern Recognition and Machine Learning |date=2006-08-17 |publisher=Springer |isbn=978-0-387-31073-2 |location=New York |language=English}}</ref>
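This layered flow of information can be sketched in a few lines of Python. The network shape (2 inputs, 3 hidden neurons, 1 output), the specific weight values, and the sigmoid activation below are illustrative assumptions, not values from any particular system:

```python
import math

def sigmoid(x):
    # Squashes any real number into the interval (0, 1);
    # a common choice of activation function.
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each neuron in the layer takes a weighted sum of all the
    # previous layer's outputs plus a bias, then applies the
    # activation function to produce its own output.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hypothetical weights for a 2-3-1 network:
# 2 inputs -> hidden layer of 3 neurons -> 1 output neuron.
hidden_w = [[0.5, -0.6], [0.1, 0.8], [-0.3, 0.2]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[1.2, -0.7, 0.4]]
output_b = [0.05]

x = [1.0, 0.5]                      # input layer values
h = layer(x, hidden_w, hidden_b)    # hidden layer activations
y = layer(h, output_w, output_b)    # output layer activation
print(y)
```

Information only moves forward here; what makes the network useful is choosing the weights so that the output approximates a desired function, which is the subject of training.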

The "signal" input to each neuron is a number: a weighted sum (linear combination) of the outputs of the connected neurons in the previous layer. The signal each neuron outputs is computed from this number by its activation function. The behavior of the network depends on the strengths (or weights) of the connections between neurons. A network is trained by adjusting these weights, typically through empirical risk minimization with gradients computed by backpropagation, so that its outputs fit a preexisting dataset.<ref>{{Cite book |last=Vapnik |first=Vladimir N. |title=The Nature of Statistical Learning Theory |date=1998 |publisher=Springer |isbn=978-0-387-94559-0 |edition=Corrected 2nd print. |location=New York}}</ref>
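As a minimal sketch of this weight-adjustment idea, the following trains a single sigmoid neuron by stochastic gradient descent on a squared-error loss to fit a tiny preexisting dataset (the logical AND function). The learning rate, epoch count, and loss choice are illustrative assumptions; real networks extend the same gradient computation through hidden layers via backpropagation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Tiny preexisting dataset to fit: logical AND of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
        ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]

w = [0.0, 0.0]  # connection weights (strengths)
b = 0.0         # bias term
lr = 1.0        # learning rate (assumed value)

for epoch in range(2000):
    for x, t in data:
        y = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        # Gradient of the squared error 0.5 * (y - t)**2 with respect
        # to the pre-activation sum, using sigmoid'(z) = y * (1 - y).
        g = (y - t) * y * (1 - y)
        # Nudge each weight against the gradient to reduce the error.
        w[0] -= lr * g * x[0]
        w[1] -= lr * g * x[1]
        b -= lr * g

# Threshold the trained neuron's outputs at 0.5.
predictions = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + b))
               for x, _ in data]
print(predictions)
```

Because AND is linearly separable, a single neuron suffices here; fitting functions that are not linearly separable is what motivates hidden layers.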

The term deep neural network refers to neural networks with more than three layers, that is, at least two hidden layers in addition to the input and output layers.

Neural networks are used to solve problems in artificial intelligence, and have thereby found applications in many disciplines, including predictive modeling, adaptive control, facial recognition, handwriting recognition, general game playing, and generative AI.

== History ==

{{see also|Biological neural network#History|History of artificial neural networks}}

The theoretical base for contemporary neural networks was independently proposed by Alexander Bain in 1873<ref>{{cite book |last=Bain |first=Alexander |title=Mind and Body: The Theories of Their Relation |year=1873 |publisher=D. Appleton and Company |location=New York}}</ref> and William James in 1890.<ref>{{cite book |last=James |first=William |title=The Principles of Psychology |url=https://archive.org/details/principlespsych01jamegoog |year=1890 |publisher=H. Holt and Company |location=New York}}</ref> Both posited that human thought emerged from interactions among large numbers of neurons inside the brain. In 1949, Donald Hebb described Hebbian learning, the idea that neural networks can change and learn over time by strengthening a synapse every time a signal travels along it.<ref>{{Cite book |last=Hebb |first=D.O. |title=The Organization of Behavior |publisher=Wiley & Sons |location=New York |year=1949}}</ref>

Artificial neural networks were originally used to model biological neural networks starting in the 1930s under the approach of connectionism. However, beginning with the mathematical model of the neuron proposed by Warren McCulloch and Walter Pitts in 1943,<ref>{{cite journal |last1=McCulloch |first1=W |last2=Pitts |first2=W |title=A Logical Calculus of Ideas Immanent in Nervous Activity |journal=Bulletin of Mathematical Biophysics |date=1943 |volume=5 |issue=4 |pages=115–133 |doi=10.1007/BF02478259 |url=https://www.bibsonomy.org/bibtex/13e8e0d06f376f3eb95af89d5a2f15957/schaul}}</ref> followed by Frank Rosenblatt's perceptron, a simple artificial neural network implemented in hardware in 1957,<ref>{{cite journal |last=Rosenblatt |first=F. |year=1958 |title=The Perceptron: A Probabilistic Model For Information Storage And Organization In The Brain |journal=Psychological Review |volume=65 |issue=6 |pages=386–408 |citeseerx=10.1.1.588.3775 |doi=10.1037/h0042519 |pmid=13602029 |s2cid=12781225}}</ref> artificial neural networks became increasingly used for machine learning applications instead, and increasingly different from their biological counterparts.

== See also ==

== References ==
{{Reflist}}