Specifies the type of summary statistic. Summary statistics are a single number representation of the characteristics of a set of values.
Version 2.0 adds two new codes, PercentageOfValidCases and PercentageOfInvalidCases, to make the list more complete and easier to use.
ADDED CODES: PercentageOfValidCases, PercentageOfInvalidCases.
DDI 3.2
A classification of the type of summary statistic provided. Supports the use of an external controlled vocabulary. DDI strongly recommends the use of a widely shared controlled vocabulary to support interoperability.
Module Name: physicalinstance
Element Name: TypeOfSummaryStatistic
DDI 2.5
This vocabulary cannot be used for the element sumStat (4.3.14) in DDI 2.1, because that element already comes with a hard-coded controlled vocabulary in its "type" attribute. To use it in DDI 2.5, select the value "other" in the "type" attribute and insert the appropriate value from the external CV in the "otherType" attribute. Use the complex element controlledVocabUsed (in the docDscr section) to identify the controlled vocabulary to which the selected term belongs.
Element Number in DDI 2.1: 4.3.14
Element/Attribute Name: sumStat
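The DDI 2.5 usage described above can be sketched in Python with the standard library's xml.etree.ElementTree. This is a minimal illustration, not an official DDI tool: the helper name build_sum_stat and the sample value are hypothetical, the DDI Codebook namespace is omitted for brevity, and the accompanying controlledVocabUsed entry in docDscr is not shown.

```python
import xml.etree.ElementTree as ET


def build_sum_stat(cv_code: str, value: str) -> ET.Element:
    """Build a sumStat element that carries a term from the external CV.

    Hypothetical helper: it sets type="other" and places the CV code in
    otherType, as described for DDI 2.5. No namespace or schema validation
    is applied here.
    """
    elem = ET.Element("sumStat")
    elem.set("type", "other")          # hard-coded list does not contain the term
    elem.set("otherType", cv_code)     # code taken from SummaryStatisticType
    elem.text = value                  # the statistic itself, as text
    return elem


# Illustrative value only; it is not taken from any real dataset.
sum_stat = build_sum_stat("PercentageOfValidCases", "97.5")
print(ET.tostring(sum_stat, encoding="unicode"))
```

The same pattern applies to any other code in this vocabulary that has no equivalent in the hard-coded "type" list.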
Creative Commons Attribution-ShareAlike 3.0
http://creativecommons.org/licenses/by-sa/3.0/
http://i.creativecommons.org/l/by-sa/3.0/80x15.png
Copyright ©
DDI Alliance
http://www.ddialliance.org/
2014
SummaryStatisticType
Summary Statistic Type
2.0
urn:ddi-cv:SummaryStatisticType
urn:ddi-cv:SummaryStatisticType:2.0
http://www.ddialliance.org/Specification/DDI-CV/SummaryStatisticType_2.0_Genericode1.0_DDI-CVProfile1.0.xml
http://www.ddialliance.org/Specification/DDI-CV/SummaryStatisticType_2.0.html
http://www.ddialliance.org/Specification/DDI-CV/SummaryStatisticType_2.0_InputSheet_Excel2003.xls
DDI Alliance
The Alliance for the Data Documentation Initiative
DDI
Code
Value of the Code
Term
Descriptive Term of the Code
Definition
Definition of the Code
CodeKey
The unique identification of each item in a code list.
ArithmeticMean
Arithmetic mean (X̄)
Mathematical average of a set of values. The mean is calculated by adding up two or more values and dividing the total by their number. In social/political science, it is usually the sum of the measurements divided by the number of subjects, or cases.
GeometricMean
Geometric mean
The nth root of the product of all n values. Rarely used in social sciences.
HarmonicMean
Harmonic mean
The reciprocal of the arithmetic mean of the reciprocals of the values. Rarely used in social sciences.
TrimmedMean
Trimmed mean
The (arithmetic) mean calculated after discarding given parts of observations at the high and low end (e.g., interquartile mean when the lowest 25% and the highest 25% are discarded, and the mean of the remaining values is calculated).
StandardErrorOfMean
Standard error of the mean
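The standard error of the mean, i.e., the standard deviation of the sampling distribution of the mean. It is commonly estimated as the sample standard deviation divided by the square root of the number of valid cases.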
Mode
Mode (Mo)
The most frequently observed data value (Statistics Canada).
Median
Median (Mdn)
The value below which, and above which, half of the values in a distribution fall (50th percentile).
ValidCases
Valid cases
Cases with observations that are considered valid, i.e., that provide substantive information and are included in calculations.
InvalidCases
Invalid cases
Cases which are considered/defined as "missing" (e.g., not ascertained, not applicable, etc.), usually excluded from calculation.
Minimum
Minimum
The lowest valid value in a variable.
Maximum
Maximum
The highest valid value in a variable.
Range
Range
The range of valid values, i.e., all values that fall between the lowest and highest valid values.
Sum
Sum
The sum or total of the values, across all valid cases.
Variance
Variance (s²)
The variance is the mean square deviation of the variable around the average value. It reflects the dispersion of a frequency distribution around its mean (OECD Glossary of Statistics).
StandardDeviation
Standard deviation (s)
The positive square root of the variance. The most widely used measure of dispersion of a frequency distribution.
CoefficientOfVariation
Coefficient of variation (CV)
Standard deviation divided by the mean.
AverageAbsoluteDeviation
Average absolute deviation (AAD)
The average of the absolute differences between each value and the overall mean. Measure of statistical dispersion around the mean, alternative to Standard Deviation.
MedianAbsoluteDeviation
Median absolute deviation (MAD)
The median absolute deviation from the median. Measure of statistical dispersion around the median.
FirstQuartile
First quartile
The first of three values which separate the total frequency of a distribution into four equal parts.
SecondQuartile
Second quartile
The second of three values which separate the total frequency of a distribution into four equal parts (= median).
ThirdQuartile
Third quartile
The third of three values which separate the total frequency of a distribution into four equal parts.
InterquartileRange
Interquartile range
The range between the first and third quartile values.
FirstQuintile
First quintile
The first of four values which separate the total frequency of a distribution into five equal parts.
SecondQuintile
Second quintile
The second of four values which separate the total frequency of a distribution into five equal parts.
ThirdQuintile
Third quintile
The third of four values which separate the total frequency of a distribution into five equal parts.
FourthQuintile
Fourth quintile
The fourth of four values which separate the total frequency of a distribution into five equal parts.
InterquintileRange
Interquintile range
The range between the first and fourth quintile values.
FirstDecile
First decile
The first of nine values which separate the total frequency of a distribution into ten equal parts.
SecondDecile
Second decile
The second of nine values which separate the total frequency of a distribution into ten equal parts.
ThirdDecile
Third decile
The third of nine values which separate the total frequency of a distribution into ten equal parts.
FourthDecile
Fourth decile
The fourth of nine values which separate the total frequency of a distribution into ten equal parts.
FifthDecile
Fifth decile
The fifth of nine values which separate the total frequency of a distribution into ten equal parts (= median).
SixthDecile
Sixth decile
The sixth of nine values which separate the total frequency of a distribution into ten equal parts.
SeventhDecile
Seventh decile
The seventh of nine values which separate the total frequency of a distribution into ten equal parts.
EighthDecile
Eighth decile
The eighth of nine values which separate the total frequency of a distribution into ten equal parts.
NinthDecile
Ninth decile
The ninth of nine values which separate the total frequency of a distribution into ten equal parts.
InterdecileRange
Interdecile range
The range between the first and ninth decile values.
OtherPercentile
Other percentile
A percentile not covered by any of the other percentile terms.
Beta1
Skewness
A measure for the asymmetry of a probability distribution of a variable.
Beta2
Kurtosis
A measure for the "peakedness" of a probability distribution of a variable.
ShapiroWilk
Shapiro-Wilk
Test statistic of the Shapiro-Wilk test for normality.
PercentageOfValidCases
Percentage of valid cases
The number of valid cases as a percentage of the total number of cases.
PercentageOfInvalidCases
Percentage of invalid cases
The number of invalid cases as a percentage of the total number of cases.
Other
Other
Use if the summary statistic type is known, but not found in the list.
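As an illustration of how the definitions above map onto computed values, the following sketch derives a subset of the statistics with Python's standard statistics module and keys each result by its code. Several choices here are assumptions that the vocabulary itself does not specify: the function name summarize is hypothetical, invalid cases are represented as None, variance and standard deviation use the sample (n - 1) form, and quartiles use the default "exclusive" method of statistics.quantiles.

```python
import statistics
from typing import Optional, Sequence


def summarize(values: Sequence[Optional[float]]) -> dict:
    """Compute selected summary statistics, keyed by SummaryStatisticType code.

    Assumptions (not prescribed by the vocabulary): None marks an invalid
    case, and dispersion measures use the sample (n - 1) denominator.
    """
    valid = [v for v in values if v is not None]
    total = len(values)
    n = len(valid)
    mean = statistics.fmean(valid)
    sd = statistics.stdev(valid)                      # sample standard deviation
    q1, q2, q3 = statistics.quantiles(valid, n=4)     # three quartile cut points
    return {
        "ValidCases": n,
        "InvalidCases": total - n,
        "PercentageOfValidCases": 100.0 * n / total,
        "PercentageOfInvalidCases": 100.0 * (total - n) / total,
        "Minimum": min(valid),
        "Maximum": max(valid),
        "Sum": sum(valid),
        "ArithmeticMean": mean,
        "GeometricMean": statistics.geometric_mean(valid),
        "HarmonicMean": statistics.harmonic_mean(valid),
        "Mode": statistics.mode(valid),
        "Median": statistics.median(valid),
        "Variance": statistics.variance(valid),       # sample variance
        "StandardDeviation": sd,
        "StandardErrorOfMean": sd / n ** 0.5,
        "CoefficientOfVariation": sd / mean,
        "FirstQuartile": q1,
        "SecondQuartile": q2,
        "ThirdQuartile": q3,
        "InterquartileRange": q3 - q1,
    }


# Made-up example data: ten cases, two of them invalid.
result = summarize([2, 4, 4, None, 5, 7, 9, None, 4, 5])
for code, value in result.items():
    print(f"{code}: {value}")
```

The sketch needs at least two valid cases and, for the geometric mean, strictly positive values; otherwise the statistics module raises StatisticsError. Software packages differ in their quantile conventions, so quartile values may not match those reported by other tools.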