Data Mining in the Cloud Part 3

Stephen A. Broeker

May 6, 2016

Computer Engineering

In my first Data Mining blog post, I defined the concept of the Data Mining Stack, which consists of three layers. The top layer comprises the Data Mining Algorithms. The middle layer consists of the OLAP HyperCube. And the bottom layer is defined by the Ingest DataBase.

In the second Data Mining post, I discussed how the Data Mining Stack fits into the Cloud, and why we would want to implement it in the Cloud in the first place.

Data Mining in the Cloud Part 2

Stephen A. Broeker

May 6, 2016

Computer Engineering

In my last Data Mining post, I defined the concept of the Data Mining Stack, which consists of three layers. The top layer comprises the Data Mining Algorithms. The middle layer consists of the OLAP HyperCube, and the bottom layer is defined by the Ingest DataBase. So how does the Data Mining Stack fit into the Cloud? And why would we want to implement it in the Cloud in the first place?

Data Mining in the Cloud – Part 1

Stephen A. Broeker

May 6, 2016

Computer Engineering

Data Mining in the Cloud is a hot topic nowadays, so I’d like to define the Data Mining Stack and show how it fits in the Cloud.

Data Mining Stack & The Cloud

First, I will define the Data Mining Stack to consist of three layers: Algorithms (Heuristics), HyperCube, and Ingest DataBase. The first layer (Algorithms) consists of mathematical methods that draw conclusions from data sets. Some of the more common methods are Gaussian Elimination, Neural Networks, and Association Rules. It is important to note that deriving inferences is grounded in the field of statistics, and statistics gets its power from large data sets. It does not make sense to base decisions on small data sets: results drawn from them are dominated by random variation and are therefore statistically insignificant.
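
To make the point about sample size concrete, here is a minimal Python sketch. It is only an illustration, not part of the Data Mining Stack itself: the function name spread_of_sample_means, the standard normal test data, and the sample sizes of 10 and 1000 are all assumptions chosen for the example. It repeatedly draws samples of a given size and measures how much the sample mean wobbles from one sample to the next.

```python
import random
import statistics


def spread_of_sample_means(sample_size, trials=500, seed=0):
    """Return the standard deviation of the mean over many random samples.

    Hypothetical helper for illustration: each trial draws `sample_size`
    values from a standard normal distribution and records their mean.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


# A small data set versus a large one (sizes are arbitrary assumptions).
small = spread_of_sample_means(10)
large = spread_of_sample_means(1000)

print(f"spread of the mean with   10 points: {small:.3f}")
print(f"spread of the mean with 1000 points: {large:.3f}")
```

With 100 times more data, the spread of the estimate should shrink by roughly a factor of 10, since the standard error of a mean falls in proportion to one over the square root of the sample size. A conclusion drawn from the 10-point samples is therefore far more likely to be an accident of which points happened to be drawn.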