With recent advances in data science and information technology, the ways in which companies and manufacturers manage data and operate on it have changed dramatically, and so have the products and services they offer to the public. AI and machine learning have made the collection, analysis, and interpretation of data readily accessible and understandable, allowing customer experiences to be customized.
This revolution in data science and information technology has given rise to many new fields, including data visualization, statistical analysis, database management, and neural networks. Of these, neural networks are particularly interesting and prominent. A neural network consists of algorithms that, combined, can identify relationships between objects or data much as a human brain would. Neural networks can model problems and predict future outcomes from given data.
Fueled by artificial intelligence, neural networks have many uses, ranging from enterprise and business applications to scientific and legal ones. In this article, I hope to shed some light on the concept of neural networks, explain how they work, and briefly go over some real-world applications.
How They Work
Let's start off by answering an essential question: how do neural networks work?
Artificial neural networks (ANNs for short) work by simulating how a neural network inside the brain operates. They take information as input, process it, and then send a processed signal to the output. The input can be almost anything a computer can process: numerical data, media files such as videos and recordings, and so on. The type of information an ANN accepts depends on its purpose, since different ANNs serve different purposes. One may have the goal of learning a person's voice; another may have the goal of figuring out which form of a product best suits a customer.

ANNs are, of course, much more complex than a simple input-output web, as many other variables and methods operate inside this basic structure. For instance, not all inputs are worth the same to an ANN. Different inputs carry different weights, called \(w\) values, which cause some information to be prioritized over other information. Since ANNs are networks, they consist of an assortment of nodes connected to each other that transmit and process information, with the end goal of producing an accurate result or prediction. The \(w\) values help reach this goal by increasing or decreasing the strength of a signal's contribution to the final output. A stronger signal represents greater importance of, and bias toward, the information coming from that node; less important information does not need to be processed (or to contribute to the final output) as heavily.

These processes are what make up an ANN. Next, an ANN must be trained so that its results or predictions become accurate. Training involves the automatic, repeated adjustment of the weights in the network, which improves results. Once an optimal range of weights has been found for each node, the ANN is properly trained and ready for use.
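The weighted-input idea above can be sketched as a single artificial neuron. This is a minimal illustration, not production code; the function name, the example weights, and the choice of a sigmoid activation are all assumptions for the sake of demonstration.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weight each input, sum, then squash.

    Larger w values amplify an input's influence on the output;
    weights near zero effectively suppress that input.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation, output in (0, 1)

# Two equal inputs, but the first carries a much larger weight,
# so it dominates the neuron's output signal.
out = neuron([1.0, 1.0], weights=[4.0, 0.1], bias=-2.0)
print(round(out, 3))  # → 0.891
```

Changing the first weight from 4.0 to 0.1 would pull the output down sharply, which is exactly the "prioritization" role the \(w\) values play.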
In order to fully understand how neural networks work, let’s look at a simple model of one with 3 layers.
Figure 1 shows the most basic neural network, with a total of 3 layers of artificial neurons. The first layer consists of input nodes, which take in (but do not process) the input, and the last is the output layer, which contains the output-producing nodes. The middle layer, however, is the more interesting one because of its function: the hidden layer identifies which information is important and which is not. The \(w\) values mentioned earlier are assigned and used here to do all the information processing before the signal is finally sent to the output nodes. The more important signals are amplified and passed on to the outputs, while redundant or unnecessary information is left out; whether a piece of information matters is determined by the weights assigned to it. Essentially, the input data is handled in the hidden layer. Since there is only one output neuron in this example, all the information in the hidden layer is compiled into a single output. Real networks get much more complex than this: the figure shows no \(w\) values, and many other algorithms would generally be implemented.
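A forward pass through the 3-layer network in Figure 1 can be sketched as follows. The weights here are illustrative placeholders, not values from the figure, and the sigmoid activation is an assumption for demonstration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Forward pass: input layer -> hidden layer -> single output node.

    hidden_weights: one weight list per hidden neuron (its w values).
    output_weights: one weight per hidden neuron, combining the hidden
    signals into the single output.
    """
    hidden = [sigmoid(sum(x * w for x, w in zip(inputs, ws)))
              for ws in hidden_weights]
    return sigmoid(sum(h * w for h, w in zip(hidden, output_weights)))

# 2 input nodes, 2 hidden neurons, 1 output node.
y = forward([0.5, -1.0],
            hidden_weights=[[0.8, -0.2], [0.4, 0.9]],
            output_weights=[1.5, -0.7])
print(y)
```

Note how each hidden neuron weighs the raw inputs, and the output node then weighs the hidden signals: amplifying what matters and damping what does not, exactly as described above.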
In reality, industry-standard ANNs are much more complicated than feed-forward webs, though there is only room to explain the concept here. Algorithms such as backpropagation, combined with the weight values, allow ANNs to be trained and modeled, letting them learn from their errors. Backpropagation refines the originally assigned weights in 4 steps: it calculates the error produced with the current \(w\) values, checks whether that error is minimized, updates the \(w\) values, and repeats the process until the error is as small as possible. Furthermore, a gradient descent algorithm can be used to make an ANN more accurate in operation: different weights are tried to produce different results, after which the errors are evaluated. Over many runs of the ANN and the algorithm, the weights are repeatedly changed slightly until an optimal value is reached. Because these minima lie within local regions of the error surface, each is a local minimum rather than a global, or absolute, minimum. The optimal weight range is one in which changing the weights no longer decreases the output error in each node's region, meaning the error has been minimized. Reaching this range indicates that training is (mostly) done, and that the ANN is ready for use in future predictions and operations.
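Gradient descent on a single weight can be sketched in a few lines. This toy example (one weight, one training pair, a squared-error loss, and an arbitrary learning rate of 0.05) is an illustration of the idea, not the full backpropagation procedure, which would repeat this update across every weight in the network.

```python
# Fit y = w * x to one training pair (x, target) by gradient descent.
# The error is (prediction - target)**2; each step nudges w a small
# amount against the error gradient, walking downhill on the error surface.
x, target = 2.0, 6.0   # illustrative data; the error-minimizing w is 3.0
w, lr = 0.0, 0.05      # initial weight and learning rate

for _ in range(200):
    pred = w * x
    grad = 2 * (pred - target) * x   # d(error)/dw
    w -= lr * grad                   # small update toward lower error

print(round(w, 3))  # → 3.0
```

Each pass makes only a slight change to \(w\), and the updates stop mattering once the error can no longer be decreased, which is the "optimal weight range" described above.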
Sometimes, however, a newly trained model becomes very good at processing the data in its training set but poor at classifying data it was not trained on; this is called overfitting. Overfitting can be reduced by adding more data to the training set, allowing the model to learn from, and thus recognize, a more diverse range of data, which increases the accuracy of its predictions. Additionally, a technique called data augmentation can be applied, in which existing data is reasonably modified to cover the range of inputs the ANN needs to handle. Once weight adjustment produces a satisfactory result and overfitting is minimized, the ANN is ready for use in real-world applications.
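For numeric data, one simple form of augmentation is adding slightly jittered copies of each sample, so the model sees more variety without new data collection. The function name, noise level, and copy count below are all illustrative assumptions; real augmentation schemes depend heavily on the data (for images, for example, flips and crops are common instead).

```python
import random

def augment(samples, copies=3, noise=0.05, seed=0):
    """Grow a numeric training set by appending jittered copies of
    each sample (small Gaussian noise added to every feature)."""
    rng = random.Random(seed)
    out = list(samples)
    for s in samples:
        for _ in range(copies):
            out.append([v + rng.gauss(0.0, noise) for v in s])
    return out

data = [[1.0, 2.0], [3.0, 4.0]]
bigger = augment(data)
print(len(bigger))  # → 8: the 2 originals plus 2 * 3 jittered copies
```

The modifications must stay "reasonable": noise large enough to add variety, but small enough that each copy still deserves its original label.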
Now, let's briefly look at a few real-world applications of artificial neural networks. In general, ANNs fall into 3 categories based on method: classification, time series, and optimization. Classification ANNs perform binary decisions or multiple-class identification. They can be used in practical situations such as credit card fraud detection; in fact, ANNs of this sort have already been used at First USA Bank, Mellon Bank, and many others. They can also be used for bomb detection via thermal neutron analysis. The second category, time series, covers ANNs that predict future outcomes or data from previous, historical data. Time-series solutions apply to various real-life situations, including weather forecasting and the foreign exchange trading systems at banks such as Citibank London. Another rather interesting, and quite specific, application in this category is identifying dementia from electrical activity in the brain: an electrode-electroencephalogram analysis built with an ANN proved more efficient and accurate than other methods such as the Z statistic and discriminant analysis. Finally, the third category, optimization, involves a different strategy, because these ANNs produce approximations to the solutions of NP-hard problems; they operate unsupervised, without prior data to learn from. Examples include scheduling in manufacturing and telecommunication routing.
This has been a very simple introduction to the concept of neural networks. ANNs are extremely useful and increasingly popular in business, and even consumer, applications. Hopefully this information has given you an idea of how data is used in various areas to process information and make decisions, as well as some interest in the growing field of data science and machine learning.