Chapter 1 Data and Statistics
1.2 Data
Scales of Measurement fall into the following categories: nominal scale (labels or names; the attribute description alone identifies the item), ordinal scale (the data can be ranked, or ordered, for example with respect to service quality), interval scale (interval data are always numeric), and ratio scale.
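Summary:
nominal: the values are independent labels with no relationship or order among them.
ordinal: the values are related to one another and can be compared (ranked).
interval: builds on ordinal; in addition, the interval between values is expressed in a fixed unit of measure, so differences are meaningful.
ratio: builds on interval; in addition, the ratio of two values is meaningful because the scale has a true zero.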
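The cumulative structure in this summary can be sketched in Python. This is only an illustration: the `Scale` enum, the helper functions, and the example variable names are hypothetical and do not come from the textbook.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Scales of measurement, ordered so each level adds to the one before."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def can_rank(scale: Scale) -> bool:
    # Ranking requires at least an ordinal scale.
    return scale >= Scale.ORDINAL


def can_subtract(scale: Scale) -> bool:
    # Differences are meaningful from the interval scale upward.
    return scale >= Scale.INTERVAL


def can_divide(scale: Scale) -> bool:
    # Ratios are meaningful only on the ratio scale (true zero).
    return scale >= Scale.RATIO


# Hypothetical example variables, chosen for illustration only.
examples = {
    "state_name": Scale.NOMINAL,
    "service_quality": Scale.ORDINAL,
    "temperature_fahrenheit": Scale.INTERVAL,
    "weight_kg": Scale.RATIO,
}

for name, scale in examples.items():
    print(name, scale.name, can_rank(scale), can_subtract(scale), can_divide(scale))
```

Using an ordered enum makes the "builds on" relationship explicit: every operation allowed at one level is also allowed at every higher level.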
Categorical and Quantitative Data
The statistical method appropriate for summarizing data depends upon whether the data are categorical or quantitative.
Categorical data use either the nominal or ordinal scale of measurement.
Quantitative data are obtained using either the interval or ratio scale of measurement.
Even when the categorical data are identified by a numerical code, arithmetic operations such as addition, subtraction, multiplication, and division do not provide meaningful results.
Arithmetic operations provide meaningful results for quantitative variables. More alternatives for statistical analysis are possible when data are quantitative.
Summary:
Compared with categorical data, quantitative data support more kinds of statistical analysis.
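A small Python sketch of why arithmetic on coded categorical data is not meaningful while counting is. The marital-status codes and the ages are made-up values, and the variable names are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical categorical data stored as numeric codes.
code_labels = {1: "single", 2: "married", 3: "divorced"}
marital_codes = [1, 2, 2, 3, 1, 2]

# Counting how often each category occurs is a meaningful summary.
counts = Counter(code_labels[code] for code in marital_codes)
print(counts)

# The mean of the codes can be computed, but the number describes nothing:
# there is no category "1.83", and the codes could have been assigned differently.
mean_of_codes = mean(marital_codes)
print(mean_of_codes)

# Hypothetical quantitative data: arithmetic gives an interpretable result.
ages = [23, 35, 41, 29, 52, 38]
mean_age = mean(ages)
print(mean_age)
```

The same function call succeeds in both cases; whether the result means anything depends on the scale of measurement, not on whether the values happen to be stored as numbers.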
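Cross-Sectional (a snapshot at a single point in time) and Time Series (collected over a long span of time) Data
Data covering a long span of time are more useful for statistical analysis, since they also show how values change over time.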
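The two views can be sketched as two ways of slicing the same records. The companies, years, and revenue figures below are made up, and `cross_section` and `time_series` are hypothetical helper names.

```python
# Hypothetical observations: (company, year, revenue). All values are made up.
observations = [
    ("A", 2021, 10.0), ("A", 2022, 12.0), ("A", 2023, 15.0),
    ("B", 2021, 7.0), ("B", 2022, 6.5), ("B", 2023, 8.0),
]


def cross_section(rows, year):
    """Values for many entities at one point in time."""
    return {company: value for company, y, value in rows if y == year}


def time_series(rows, company):
    """Values for one entity over several periods, in time order."""
    return sorted((y, value) for c, y, value in rows if c == company)


print(cross_section(observations, 2022))  # {'A': 12.0, 'B': 6.5}
print(time_series(observations, "A"))     # [(2021, 10.0), (2022, 12.0), (2023, 15.0)]
```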
1.3 Data Sources
Data can be obtained from existing sources or from surveys and experimental studies designed to collect new data.
Statistical Studies
Statistical studies can be classified as either experimental or observational.
In an experimental study, a variable of interest is first identified.
Nonexperimental, or observational, statistical studies make no attempt to control the variables of interest. A survey is perhaps the most common type of observational study. For instance, in a personal interview survey, research questions are first identified.
Data Acquisition Errors
Data analysts also review data with unusually large and small values, called outliers, which are candidates for possible data errors.
This is the definition of an outlier.
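One way to flag candidate outliers for review is sketched below. The notes only say "unusually large and small values"; the 1.5 × IQR rule used here is a common convention assumed for illustration, and `flag_outliers` and the sample data are hypothetical.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule, not from the notes)."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up data with one unusually small and one unusually large value.
data = [1, 12, 14, 15, 15, 16, 17, 18, 19, 95]
print(flag_outliers(data))  # [1, 95]
```

Flagged values are only candidates for possible data errors; each one still has to be checked against the source before it is corrected or removed.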
1.4 Descriptive Statistics
The most common numerical descriptive statistic is the average, or mean.
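As a minimal sketch, the mean is the sum of the values divided by the number of values. The salary figures are made up and `sample_mean` is a hypothetical helper name.

```python
def sample_mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Made-up monthly salaries.
salaries = [3850, 3950, 4050, 3880]
print(sample_mean(salaries))  # 3932.5
```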