Data of Statistics, Machine Learning and Data Base Management

Data mining is the process of analyzing
data from different perspectives and summarizing it into useful information.
Data mining software is one of a number of analytical tools for analyzing
data. It allows users to analyze data from many different dimensions or angles,
categorize it, and summarize the relationships identified. The aim of data
mining is to discover structure inside unstructured data, extract meaning from
noisy data, discover patterns in apparently random data, and use all this
information to better understand trends, patterns, and correlations in existing
data.

 

One of the most important tasks in
data mining is selecting the correct technique. The technique has to be
chosen based on the type of business and the type of problem the business
faces. A generalized approach has to be used to improve the accuracy and
cost-effectiveness of data mining techniques. The more frequently used
techniques are clustering, classification, association, decision trees, and
neural networks. This paper briefly discusses clustering techniques.

Data mining is a technology that combines
traditional data analysis methods with sophisticated algorithms for processing
large volumes of data, through which useful information and knowledge are
extracted. It is the process of extracting, or mining, information from huge
amounts of data, and is also known as knowledge mining from data. The
algorithms and techniques it uses are drawn from the fields of statistics,
machine learning, and database management systems.

 

Clustering is a form of unsupervised learning: it
deals with finding structure in a collection of unlabeled data. Such models
are sometimes called descriptive models. Clustering is one of the most common
unsupervised data mining methods for exploring the hidden structures in a
dataset. Data mining approaches are useful in bioinformatics, where clustering
is applied to gene expression data and to DNA, RNA, and protein sequences.
Data objects that are similar in nature are grouped into the same cluster,
while data objects that are highly dissimilar lie in separate clusters.
Fundamentally, clustering is an unsupervised classification in which the
class labels are unknown.
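
The grouping of similar objects can be illustrated with a small sketch.
The text does not name a specific algorithm, so the example below uses k-means
as one common choice. The sample points, the function names, and the decision
to seed the centroids with the first k points are all assumptions made for
illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Group unlabeled points into k clusters (illustrative k-means sketch).

    Assumption: the first k points seed the centroids. Practical
    implementations usually choose the starting centroids more carefully.
    """
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clusters are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_point(members)
    return labels, centroids


# Hypothetical unlabeled data: two visibly separate groups of 2-D points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
labels, centroids = k_means(points, k=2)
print(labels)
```

No class labels are supplied at any point. The first three points end up
sharing one cluster index and the last three share the other, purely because
of their mutual distances. Which index each group receives is arbitrary.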

 

Several major data mining techniques have been developed and used in data
mining projects recently, including association, classification, clustering,
prediction, and sequential patterns.