Pruning The Tree – Dimensionality Reduction In Machine Learning Neural Networks
It’s easy to forget that a machine learning neural network isn’t built to ingest everything that you throw at it.
Curating ingestion data is something of a lost art these days. There’s an assumption that relevance filtering will happen ‘somewhere along the way’. But the truth is that dimensionality reduction in machine learning neural networks starts in the planning phase. Establishing a sane data exposure rule set should be one of the first design decisions, and ingestion data should be curated to fit the established goals as much as humanly possible.
With that in mind, let’s look at ways we can prune this tree, and achieve the kind of dimensionality reduction that will prevent bloat, save time, and allow you to spend your energy on the things that matter.
Why Engage In Dimensionality Reduction For Machine Learning Neural Networks?
Performance: The curse of dimensionality is real. In machine learning terms, it is often called the Hughes phenomenon: as the number of input dimensions grows, the number of training samples needed to maintain statistical relevance and confidence grows with it, and past a certain point predictive power actually degrades. This training bloat hits on every front: bandwidth, processing power, memory allocation, and database complexity. Dimensionality reduction gets a handle on the problem by eliminating irrelevant data.
Focus: The number of coding hours that go into any project is finite. The more time spent trying to ‘see the whole landscape’, the less time there is to actually accomplish business and research directives. Dimensionality reduction keeps the focus on the things that really matter.
Storage: Every scope-based decision that impacts storage has a potential threefold ripple effect. Active storage needs to be considered while the machine learning neural network crunches the data. Backup storage is required for any project that matters enough to resurrect after a crash. And finally, cold storage needs to be allocated if the project has business continuity or disaster recovery implications. Storage costs can quickly spiral out of control if the project’s scope creeps.
Avoid Lawsuits: When data ingestion is overly broad, you risk absorbing personal data that might be protected by strict government statutes. Worse, even if collecting that data is legal today, storing that data might become illegal tomorrow. When a data collection law changes, there are rarely ‘grandfather’ clauses. Is all of your training data properly categorized and labeled, or is it just sitting in containers waiting to be parsed? Would you even realize that you had such training data if a law were to suddenly change? Reducing dimensionality to include as little personal data as possible is a wise decision.
Dimensionality Reduction Techniques
Autoencoders: This is the first technique to consider, simply because it falls into the already-familiar theme of unsupervised learning. An autoencoder is a small neural network trained to compress its input into a much lower-dimensional ‘bottleneck’ representation, and then to reconstruct the original input from that compressed form. If it can repeatedly reproduce the significant features of the source data while entire categories of input get squeezed out of the bottleneck, you know exactly which dimensions can be reduced safely, and the compression step doubles as a noise filter.
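To make that concrete, here is a minimal sketch of an autoencoder used this way. It assumes Keras/TensorFlow is available and that the inputs are flat numeric feature vectors; the layer sizes, the 32-unit bottleneck, and the placeholder training data are illustrative assumptions rather than recommendations.

```python
# Minimal autoencoder sketch (assumes TensorFlow/Keras; sizes are illustrative).
import numpy as np
from tensorflow import keras

input_dim = 784      # e.g. a flattened 28x28 image -- an assumption for the example
bottleneck_dim = 32  # the reduced dimensionality we want to keep

inputs = keras.Input(shape=(input_dim,))
encoded = keras.layers.Dense(128, activation="relu")(inputs)
encoded = keras.layers.Dense(bottleneck_dim, activation="relu")(encoded)
decoded = keras.layers.Dense(128, activation="relu")(encoded)
decoded = keras.layers.Dense(input_dim, activation="sigmoid")(decoded)

autoencoder = keras.Model(inputs, decoded)   # compress, then reconstruct
encoder = keras.Model(inputs, encoded)       # just the compression half
autoencoder.compile(optimizer="adam", loss="mse")

# Train the network to reproduce its own input from the bottleneck.
x_train = np.random.rand(1000, input_dim).astype("float32")  # placeholder data
autoencoder.fit(x_train, x_train, epochs=10, batch_size=64, verbose=0)

# If reconstruction loss stays low, the encoder output is a safe
# lower-dimensional stand-in for the original data.
x_reduced = encoder.predict(x_train)
print(x_reduced.shape)   # (1000, 32)
```

If the reconstruction error stays acceptably low, the 32-value encoding can replace the original 784-value input downstream.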
Algebraic Formulation: The ingestion data is analyzed and broken down into its constituent parts. A determination is made as to how important each type of input is to the end result, and those importance scores form a weight matrix. As part of this process, many components are assigned a weight of ‘0’ and become irrelevant. All other ingestion data has its value multiplied by its weight before being fed into the main matrix. This is a great way to provide some oversight of the ingestion process without getting too heavy-handed.
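As a rough illustration of that weighting step, the NumPy sketch below assumes you already have a per-feature importance score (invented here) and uses an arbitrary 0.2 cut-off to zero out weak features; in a real project the weights might come from correlation analysis, model loadings, or domain knowledge.

```python
# Sketch of weighting and zeroing input components before they reach the network.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))   # 500 samples, 8 raw input features (synthetic)

# Hypothetical importance scores, one per feature.
importance = np.array([0.9, 0.0, 0.4, 0.0, 0.7, 0.1, 0.0, 0.8])

# Features below the chosen threshold get a weight of 0 and become irrelevant.
weights = np.where(importance < 0.2, 0.0, importance)

# Multiply every remaining feature by its weight, then drop the zeroed columns.
X_weighted = X * weights
keep = weights > 0
X_reduced = X_weighted[:, keep]
print(X_reduced.shape)   # (500, 4) -- half of the original dimensions pruned
```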
Manifold Learning: This is one of those subjects that’s so complex an entire book could be written about it. In fact, it has. But briefly, higher-dimensional attributes in the training dataset are reduced to simpler elements called ‘intrinsic variables’. Let’s say images from a 10,000 ft altitude flyover are compiled over a year’s time. If one were to treat them as essentially the same image, what variables might account for the changes between them? Maybe two: ‘angle of the photo’ and ‘time passed’. These lower-dimensional latent manifolds are then used to visualize the results and prune out aspects that ultimately prove to be duplicates, which can be replaced with a link to a constant. That constant only needs to be stored once and (much of the time) requires no further analysis.
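Here is a minimal sketch of that idea using scikit-learn’s Isomap on synthetic data. The two recovered ‘intrinsic variables’ mirror the flyover example above; the synthetic data, the neighbour count, and the dimensions are all assumptions made for illustration.

```python
# Manifold learning sketch: recover two intrinsic variables from
# high-dimensional data (assumes scikit-learn; data is synthetic).
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Synthetic stand-in for the flyover imagery: 300 "images", each a
# 1,024-dimensional vector that really only varies along two hidden factors
# (think "angle of the photo" and "time passed").
angle = rng.uniform(0.0, np.pi, 300)
time_passed = rng.uniform(0.0, 1.0, 300)
basis = rng.normal(size=(2, 1024))
X = np.column_stack([angle, time_passed]) @ basis \
    + rng.normal(scale=0.01, size=(300, 1024))

# Ask Isomap for a 2-dimensional embedding -- the latent manifold.
embedding = Isomap(n_components=2, n_neighbors=10)
X_intrinsic = embedding.fit_transform(X)
print(X_intrinsic.shape)   # (300, 2)
```

Once the intrinsic variables are recovered, near-duplicate samples cluster together in the embedding and can be collapsed into a single stored constant, as described above.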
There are dozens of additional techniques that may apply to specific applications. But these three dimensionality reduction techniques for machine learning neural networks broadly cover the majority of the optimization that most projects need.
Who Wins When The Tree Is Pruned?
Certain fields benefit more than others from dimensionality reduction. The rule of thumb is: the more complex the model, the more it benefits from pruning the ingestion data.
In other words, simple machine learning neural networks that don’t have a lot of overhead might not justify investing the time and resources in dimensionality reduction. Saving a couple hundred dollars a year on processing and storage costs is fairly meaningless if hundreds of work-hours need to go into the process.
The biggest winner is probably large-scale Internet of Things (IoT) applications. Consider power grid planning as an example: literally millions of nodes report power outages, power generation from solar and wind, weather conditions used in predictive modeling for both renewable generation and domestic consumption, seasonal and event-based consumption, and the like.
Cutting down on irrelevant noise in such an application makes predictive models more accurate, reduces overhead, determines significant monitoring points, and can help reduce global warming by determining the best location for additional renewable resources. And that’s just one example from a single industry.
The next big winner is pure science applications, and in particular search applications. When poring through planetary data from thousands of different sensors and telescopes, for example, learning which factors are irrelevant to your research is half the battle. Dimensionality reduction in machine learning neural networks aids in the analysis of cosmic phenomena, the identification of Goldilocks planets, human habitation considerations, and much more. Similar techniques are used to simplify the analysis of everything from coastal erosion to medical outcomes.
Finally, as alluded to throughout this article, tech innovation relies heavily on dimensionality reduction to keep costs from spiraling out of control. There are applications in everything from logistics and warehousing to self-maintaining code bases. In large applications, dimensionality reduction can save tens of millions of dollars in labor, cloud computing costs, and storage.


