Database Systems Journal vol. IV, no. 4/2013 21
Data Mining Solutions for the Business Environment
Ruxandra PETRE
University of Economic Studies, Bucharest, Romania
ruxandra_stefania.petre@yahoo.com
Over the past years, data mining has become a matter of considerable importance due to the
large amounts of data available in applications belonging to various domains. Data mining is
a dynamic and fast-expanding field that applies advanced data analysis techniques from
statistics, machine learning, database systems or artificial intelligence in order to
discover relevant patterns, trends and relations contained within the data, information
that is impossible to observe using other techniques.
The paper focuses on presenting the applications of data mining in the business environment.
It contains a general overview of data mining, providing a definition of the concept,
enumerating six primary data mining techniques and mentioning the main fields for which
data mining can be applied. The paper also presents the main business areas which can
benefit from the use of data mining tools, along with their use cases: retail, banking and
insurance. The main commercially available data mining tools and their key features are also
presented.
Besides the analysis of data mining and the business areas that can successfully apply it, the
paper presents the main features of a data mining solution for the business environment,
together with its architecture and main components, a solution that would help improve
customer experiences and decision-making.
Keywords: Data mining, Business, Architecture, Data warehouse
1. Introduction
Nowadays, companies collect huge volumes of data on a daily basis. Analyzing this data and
discovering the meaningful information it contains has become an essential need for
businesses.
As the business environment develops and changes constantly, facing new challenges every
day, companies try to strengthen their market position and achieve competitive advantage by
using new and innovative solutions, like data mining.
Data mining solutions implement advanced data analysis techniques used by companies for
discovering unexpected patterns extracted from vast amounts of data, patterns that offer
relevant knowledge for predicting future outcomes.

2. General overview of data mining
The availability and abundance of data belonging to various domains make data analysis a
matter of significant importance and necessity today. Data mining, the analysis step within
the KDD (Knowledge Discovery in Databases) process, uses a diversity of advanced data
analysis methods to explore the data and discover useful patterns and trends.
Data mining consists of applying data analysis and discovery algorithms that, under
acceptable computational efficiency limitations, produce a particular enumeration of
patterns (or models) over the data. [1]
With the imminent growth of the amounts of data in every application, using data mining
methods for automatically identifying valid and meaningful patterns in order to produce
useful information and knowledge became a requirement for various fields including business,
education or science and engineering, fields for which data mining can fulfill the following
purposes:
• Business: data mining can be applied in retail, banking or insurance, for activities like
  customer segmentation and retention, market basket analysis or fraud detection;
• Education: data mining can be applied for grouping students, predicting student
  performance, planning and scheduling courses or understanding student behavior;
• Science and engineering: data mining can be used for domains like bioinformatics,
  astronomy, medicine, genetics, electrical power, telecommunications or climate data.
Data mining can be defined as a process of exploring and analyzing large amounts of data
with the specific aim of discovering significant patterns and rules. Data mining helps find
knowledge in raw, unprocessed data. Using data mining techniques allows extracting knowledge
from the data mart, data warehouse and, in particular cases, even from operational
databases. [2]
The data mining methods, used for extracting hidden patterns from the data, are classified
into the following two categories: description methods and prediction methods.
Description methods are oriented to data interpretation, which focuses on understanding (by
visualization for example) the way the underlying data relates to its parts.
Prediction-oriented methods aim to automatically build a behavioral model, which takes new
and unseen samples and is able to predict values of one or more variables related to the
sample. [3]
Data mining analyzes the data by applying a wide variety of techniques, developed for the
efficient handling of large volumes of data. The six primary data mining techniques are
presented below in Fig. 1:
Fig. 1 Data mining techniques
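As a minimal sketch of the distinction between description and prediction methods, the
Python fragment below first summarizes a toy data set (description) and then fits a
least-squares line to it in order to estimate a value for an unseen sample (prediction).
The data, the function names and the choice of a straight-line model are assumptions made
purely for illustration; they are not taken from the paper or from any of the tools it
discusses.

```python
from statistics import mean

# Hypothetical toy data: (monthly_visits, monthly_spend) for five customers.
customers = [(1, 12.0), (2, 19.0), (3, 31.0), (4, 42.0), (5, 48.0)]


def describe(data):
    """Description: summarize the data that has already been observed."""
    visits = [v for v, _ in data]
    spend = [s for _, s in data]
    return {
        "mean_visits": mean(visits),
        "mean_spend": mean(spend),
        "min_spend": min(spend),
        "max_spend": max(spend),
    }


def fit_line(data):
    """Prediction, step 1: least-squares fit of spend = intercept + slope * visits."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, visits):
    """Prediction, step 2: apply the fitted model to a new, unseen sample."""
    intercept, slope = model
    return intercept + slope * visits


if __name__ == "__main__":
    print(describe(customers))          # descriptive summary of known data
    model = fit_line(customers)
    print(predict(model, 6))            # estimate for a customer not in the data
```

The descriptive step only reports on the records it was given, while the predictive step
produces a model that can be reused on samples it has never seen, which is the property
the text attributes to prediction-oriented methods.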
The main data mining techniques are organized into the following categories: [1]
• Classification: consists of a function that maps (classifies) a data item into one of
  several predefined classes;
• Regression: involves a function that maps a data item to a real-valued prediction
  variable;
• Clustering: is a common descriptive task where one seeks to identify a finite set of
  categories or clusters to describe the data;
• Association rule learning (Dependency modeling): consists of finding a model that
  describes significant dependencies between variables;
• Anomaly detection (Change and deviation detection): focuses on discovering the most
  significant changes in the data from previously measured or normative values;
• Summarization: involves methods for finding a compact description for a subset of data.
Data mining has evolved in the past two decades, becoming a fundamental discovery process.
It has incorporated techniques from many other fields, including statistics, machine
learning and database systems.
The diversity of data and the multitude of data mining techniques provide various
applications for data mining, which have improved many domains of human life.

3. Data mining applications for business
Data mining is defined as a business process for exploring large amounts of data to
discover meaningful patterns and rules. [4]
Companies can apply data mining in order to improve their business and gain advantages over
the competitors.
The most important business areas that successfully apply data mining, presented in
Fig. 2 below, are:
Fig. 2 Business areas that successfully apply data mining
1. Retail
Retail data mining can help identify customer buying behaviors, discover customer shopping
patterns and trends, improve the quality of customer service, achieve better customer
retention and satisfaction, enhance goods consumption ratios, design more effective goods
transportation and distribution policies, and reduce the cost of business. [5]
Data mining techniques have many applications in the retail industry, including the
following:
• Customer segmentation: identify customer groups and associate each customer with the
  proper group;
• Establish customer shopping behavior: identify customer buying patterns and determine
  what products the customer is likely to buy next;
• Customer retention: identify customer shopping patterns and adjust the product portfolio,
  the pricing and the promotions offered;
• Analyze sales campaigns: predict the effectiveness of a sales campaign based on certain
  factors, like the discounts offered or the advertisements used.
The retail industry offers a wide area of applications for data mining due to the large
amounts of data available for companies.

2. Banking
There are various areas in which data mining can be used in the financial sector, such as
customer segmentation and profitability, credit analysis, predicting payment default,
marketing, detecting fraudulent transactions, ranking investments, optimizing stock
portfolios, cash management and forecasting operations, identifying high-risk loan
applicants, finding the most profitable credit card customers, and cross-selling. [6]
The main examples of applications of the data mining techniques in the banking industry are
the following:
• Credit scoring: distinguish the factors, like customer payment history, that can have a
  higher or lower influence over loan payment;
• Customer segmentation: establish customer groups and include each new customer in the
  right group;
• Customer retention: identify customer shopping patterns and adjust the product portfolio,
  the pricing and the promotions offered;
• Predict customer profitability: identify patterns based on various factors, like products
  used by a customer, in order to predict the profitability of the customer.
The information systems for the banking industry contain large amounts of operational and
historical data, making banking a fitting application area for data mining.

3. Insurance
Data mining can help insurance firms in business practices such as: acquiring new
customers, retaining existing customers, performing sophisticated classification or
correlation between policy designing and policy selection. [7]
In insurance the data mining techniques have the following applications:
• Risk factor identification: analyze the factors, like customer claims history or behavior
  patterns, that can have a stronger or weaker influence over the insured's level of risk;
• Fraud detection: establish patterns of fraud and analyze the factors that indicate a high
  probability of fraud for a claim;
• Customer segmentation and retention: establish customer groups, assign each new customer
  to the appropriate group, and identify discounts and packages that would increase
  customer loyalty.
Data mining techniques have many applications in the insurance business and can improve it
by analyzing the large amounts of data available for companies.

4. Data mining tools used in the business environment
Commercially available data mining tools implement various data mining techniques for
performing advanced data analysis on large volumes of data. The main data mining products,
presented in Table 1 below, along with their key features, are: IBM SPSS Modeler, developed
by IBM, the data mining tools included by Microsoft SQL Server Analysis Services, Oracle
Data Mining, embedded within the Oracle database, SAS Enterprise Miner, produced by SAS, and
STATISTICA Data Miner, developed by StatSoft.