Author Archives: Victor Dey - Page 27

15 Sep

All You Need to Know about Data Fabric

Data Fabric uses existing metadata assets to support the design, deployment and proper utilization of data across all environments and platforms. The concept aims to speed up the extraction of insights from data through a range of automated processes.
15 Sep

What Is Graph Analytics & Its Top Tools

Graph analytics, also known as graph algorithms, are analytic tools used to analyze the relationships between the entities of an organization, such as products, customers and services, and to determine the strength of those relationships, which are depicted in the form of a graph.
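
As a rough illustration of measuring relationship strength on a graph, the sketch below sums edge weights per entity. The entity names, the weights and the `weighted_degree` helper are hypothetical and chosen only for this example; dedicated graph analytics tools offer far richer measures.

```python
from collections import defaultdict

# Hypothetical weighted edges: (entity_a, entity_b, interaction_count)
edges = [
    ("customer_1", "product_A", 3),
    ("customer_1", "service_X", 1),
    ("customer_2", "product_A", 5),
    ("customer_2", "product_B", 2),
]

def weighted_degree(edge_list):
    """Sum the weights of all edges touching each node (undirected graph)."""
    strength = defaultdict(int)
    for a, b, weight in edge_list:
        strength[a] += weight
        strength[b] += weight
    return dict(strength)

strength = weighted_degree(edges)
# The entity with the largest total relationship strength
most_connected = max(strength, key=strength.get)
print(most_connected, strength[most_connected])
```

Weighted degree is only one simple measure; it is used here because it needs nothing beyond the standard library.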
12 Sep

Understanding the Importance of Data Cleaning and Normalization

Data cleaning is a critical aspect of data management. The data cleansing process involves reviewing all the data within a database to remove or update information that is incomplete, incorrect, duplicated or irrelevant. Data cleansing is not simply about erasing old information to make space for new data; rather, it is about finding a way to maximize the dataset’s accuracy without necessarily tampering with the available data. In short, data cleaning is the process of identifying and correcting wrong data. Organizations rely on data for most things, but only a few properly address data quality.
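
A minimal sketch of these ideas, assuming a small list of hypothetical records with `name` and `age` fields: it drops incomplete or invalid rows, removes duplicates after standardizing the text, and then applies min-max normalization. The field names and rules are illustrative assumptions, not a prescribed procedure.

```python
# Hypothetical raw records with stray whitespace, a duplicate and bad values
records = [
    {"name": " Alice ", "age": "30"},
    {"name": "alice", "age": "30"},    # duplicate once the name is standardized
    {"name": "Bob", "age": ""},        # incomplete
    {"name": "Carol", "age": "abc"},   # incorrect
    {"name": "Dan", "age": "45"},
]

def clean(rows):
    """Standardize names, drop incomplete/invalid rows, remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        age_text = row["age"].strip()
        if not name or not age_text.isdigit():
            continue  # incomplete or incorrect record
        key = (name, int(age_text))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        cleaned.append({"name": name, "age": int(age_text)})
    return cleaned

def min_max_normalize(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

cleaned = clean(records)
normalized_ages = min_max_normalize([row["age"] for row in cleaned])
print(cleaned, normalized_ages)
```

In practice, whether an invalid row is dropped or corrected depends on the dataset; this sketch simply drops it.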
05 Sep

Beginners Guide to Self-Organizing Maps


A self-organizing map (SOM), proposed by Kohonen, is a neural network trained with unsupervised learning to produce a low-dimensional, discretized representation of the input space of the training samples. This representation is known as a map, which makes the SOM a method for reducing data dimensions.
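
To make the idea concrete, here is a minimal sketch of a one-dimensional SOM in plain Python. The sample data, the number of units, the linear decay schedules and the function names are all assumptions made for this illustration, not details taken from the original post.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to sample x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wj - xj) ** 2 for wj, xj in zip(weights[j], x)),
    )

def train_som(samples, n_units=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM: units sit on a line, each with its own weight vector."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays over time
        sigma = sigma0 * (1.0 - frac) + 0.01     # neighborhood radius shrinks
        for x in samples:
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood: units near the BMU on the line move more
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Hypothetical 2-D samples forming two loose clusters
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
som = train_som(samples)
# Each 2-D sample is reduced to a single discrete map position
positions = [best_matching_unit(som, x) for x in samples]
print(positions)
```

Each sample ends up described by one integer (its map position) instead of two coordinates, which is the dimensionality reduction the paragraph refers to.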

The post Beginners Guide to Self-Organizing Maps appeared first on Analytics India Magazine.

05 Sep

How To Address Bias-Variance Tradeoff in Machine Learning


Bias and variance are inversely related, and in practice it is nearly impossible to have an ML model with both low bias and low variance. Modifying an ML algorithm to fit a given data set more closely lowers the bias but increases the variance: the model fits the data set, but the chances of inaccurate predictions on new data go up. The same applies when creating a low-variance model with a higher bias: it reduces the risk of inaccurate predictions, but the model will not properly match the data set. Hence there is a delicate balance between bias and variance. Higher variance does not by itself indicate a bad ML algorithm; machine learning algorithms should be designed so that they can handle some variance. Underfitting occurs when a model is unable to capture the underlying pattern of the data. Such models usually have high bias and low variance.
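
The trade-off can be illustrated with a small simulation. The sketch below assumes a known target function and two deliberately extreme, hypothetical models: a constant predictor (underfits) and a 1-nearest-neighbor predictor that memorizes the noisy training labels (overfits). Bias and variance are estimated over repeated noisy training sets; all names and settings are assumptions for this example.

```python
import math
import random

def true_fn(x):
    """Assumed ground-truth function for the simulation."""
    return math.sin(2 * math.pi * x)

def constant_model(train_x, train_y, x):
    """Underfitting model: always predicts the mean training label."""
    return sum(train_y) / len(train_y)

def nearest_neighbor_model(train_x, train_y, x):
    """Overfitting model: returns the label of the closest training point."""
    i = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[i]

def bias_variance(predict, xs, noise_sd=0.3, trials=200, seed=0):
    """Estimate average squared bias and variance over noisy training sets."""
    rng = random.Random(seed)
    preds = [[] for _ in xs]
    for _ in range(trials):
        ys = [true_fn(x) + rng.gauss(0.0, noise_sd) for x in xs]
        for i, x in enumerate(xs):
            preds[i].append(predict(xs, ys, x))
    bias_sq = 0.0
    variance = 0.0
    for i, x in enumerate(xs):
        mean_pred = sum(preds[i]) / trials
        bias_sq += (mean_pred - true_fn(x)) ** 2
        variance += sum((p - mean_pred) ** 2 for p in preds[i]) / trials
    return bias_sq / len(xs), variance / len(xs)

xs = [i / 10 for i in range(11)]
simple_bias, simple_var = bias_variance(constant_model, xs)
flexible_bias, flexible_var = bias_variance(nearest_neighbor_model, xs)
print("constant model:   bias^2=%.3f variance=%.3f" % (simple_bias, simple_var))
print("1-NN model:       bias^2=%.3f variance=%.3f" % (flexible_bias, flexible_var))
```

Under these assumptions the constant model shows the higher bias and the nearest-neighbor model the higher variance, matching the description of underfitting above.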


05 Sep

Understanding the AUC-ROC Curve in Machine Learning Classification


AUC-ROC is a widely used metric for evaluating the performance of classification models. It tells us how capable a model is of distinguishing between the classes: the higher the AUC, the better the model. ROC curves graphically depict the connection and trade-off between sensitivity and specificity for every possible cut-off of a test or a combination of tests. The area under the ROC curve gives an idea of the benefit of using the test for the underlying question. AUC-ROC curves also serve as a performance measurement for classification problems at various threshold settings.
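
A minimal sketch of how the curve and its area can be computed by hand, assuming binary labels (1 = positive) and hypothetical model scores. Each distinct score is used as a cut-off; sensitivity is the true positive rate and specificity is one minus the false positive rate. The function names and data are illustrative assumptions.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) for every cut-off."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical ground-truth labels and classifier scores
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(round(area, 4))  # 8 of the 9 positive/negative pairs are ranked correctly
```

An AUC of 1.0 would mean every positive is scored above every negative, while 0.5 corresponds to random ranking.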
