INTRODUCTION TO DATA MINING: DATA PREPROCESSING
Data Mining Course - UFPE - June 2012
Chiara Renso
KDD-LAB
ISTI-CNR, Pisa, Italy
chiara.renso@isti.cnr.it
WHAT IS DATA?
Collection of data objects and their attributes
An attribute is a property or characteristic of an object
– Examples: eye color of a person, temperature, etc.
– An attribute is also known as a variable, field, characteristic, or feature
A collection of attributes describes an object
– An object is also known as a record, point, case, sample, entity, or instance
[Figure labels: Attributes, Objects]
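As a minimal sketch of the object/attribute view described above, the snippet below stores a tiny data set as a list of records. The data values and the names `people`, `attributes`, and `eye_colors` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data set: each dict is one object (record, instance),
# and each key is one attribute (variable, field, feature).
people = [
    {"id": 1, "eye_color": "brown", "temperature_c": 36.6},
    {"id": 2, "eye_color": "blue", "temperature_c": 37.1},
    {"id": 3, "eye_color": "green", "temperature_c": 36.9},
]

# The attribute names correspond to the columns of the table.
attributes = list(people[0].keys())

# Reading one attribute across all objects yields a column of values.
eye_colors = [person["eye_color"] for person in people]

print(attributes)
print(eye_colors)
```

Each row is an object and each key is an attribute, so adding a record extends the collection of objects, while adding a key to every record adds an attribute.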
TYPES OF ATTRIBUTES
There are different types of attributes
– Nominal
Examples: ID numbers, eye color, zip codes
– Ordinal
Examples: rankings (e.g., taste of potato chips on a scale from 1-10), grades, height in {tall, medium, short}
– Interval
Examples: calendar dates, temperatures in Celsius or Fahrenheit
– Ratio
Examples: temperature in Kelvin, length, time, counts
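The four types above differ in which operations give meaningful results. The slide does not spell this out, but the usual reading is: equality for nominal, order for ordinal, differences for interval, and ratios for ratio attributes. The sketch below illustrates this with hypothetical values; the ranking in `HEIGHT_ORDER` is an assumption made for the example.

```python
from statistics import mode

# Nominal: only equality is meaningful, so the mode is a sensible summary.
eye_colors = ["brown", "blue", "brown", "green"]
most_common = mode(eye_colors)

# Ordinal: order is meaningful, so sort with an explicit (assumed) ranking.
HEIGHT_ORDER = {"short": 0, "medium": 1, "tall": 2}
heights = ["tall", "short", "medium"]
sorted_heights = sorted(heights, key=HEIGHT_ORDER.__getitem__)

# Interval: differences are meaningful, ratios are not.
c1, c2 = 10.0, 20.0            # temperatures in Celsius
diff_c = c2 - c1               # a 10-degree difference is meaningful
ratio_c = c2 / c1              # 2.0, but 20 C is not "twice as hot" as 10 C

# Ratio: Kelvin has a true zero, so ratios are meaningful as well.
k1, k2 = c1 + 273.15, c2 + 273.15
diff_k = k2 - k1               # same difference as in Celsius
ratio_k = k2 / k1              # roughly 1.035, the physically meaningful ratio

print(most_common, sorted_heights, diff_c, ratio_c, round(ratio_k, 3))
```

The Celsius and Kelvin differences agree while the ratios do not, which is why Celsius temperature is listed as an interval attribute and Kelvin temperature as a ratio attribute.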
DISCRETE AND CONTINUOUS ATTRIBUTES
Discrete Attribute
– Has only a finite or countably infinite set of values
– Examples: zip codes, counts, or the set of words in a collection of documents
– Often represented as integer variables
– Note: binary attributes are a special case of discrete attributes
Continuous Attribute
– Has real numbers as attribute values
– Examples: temperature, height, or weight
– Practically, real values can only be measured and represented using a finite number of digits
– Continuous attributes are typically represented as floating-point variables
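The representation points above can be sketched in a few lines. The variable names and values are hypothetical; the aim is only to show an integer count, a binary attribute, and the finite precision of floating-point values.

```python
import math

word_count = 42       # discrete: a count stored as an integer
is_smoker = True      # binary: a discrete attribute with two possible values
weight_kg = 70.3      # continuous: stored as a floating-point number

# Floating-point numbers hold only finitely many digits, so exact
# equality tests on continuous values can fail unexpectedly.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)
close_enough = math.isclose(total, 0.3)

# A measurement also has limited resolution, e.g. one decimal place.
measured = round(36.6487, 1)

print(exactly_equal, close_enough, measured)
```

Because of this finite precision, comparisons on continuous attributes are usually made with a tolerance (as with `math.isclose`) rather than with `==`.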