The application of statistics in research is well documented. Before choosing a statistical method for your own research project, knowledge of scales of measurement is a prerequisite. Scales of measurement have to do with the allocation of numerical values to characteristics according to certain rules. Measurement can be either quantitative or qualitative. The qualitative level of measurement includes, among other things, aspects such as interpretation and paragraph analysis, whilst the quantitative level of measurement focuses on measures such as the nominal, ordinal, interval and ratio levels of measurement. The latter are the basic scales of measurement and will be briefly outlined.
1 Measuring scales
* Nominal measurement
Nominal measurement involves awarding a numerical value to a specific characteristic. This type of measurement is the most basic form of measurement, because it measures at the lowest level that can be measured, and it is therefore considered a scale of measurement with limitations. The following serves as an example of nominal measurement: A researcher wants to determine the profile of the academic background of his/her students. For this he/she might need information regarding the specific level (HG, SD, LG) at which the students passed each subject in their matriculation examination. The different matriculation subjects would then be listed, i.e. Mathematics, English, Geography etc. To each subject passed on the higher grade the numerical value 1 is allocated, to each pass on the standard grade the numerical value 2, and so forth. The numerical value 1 does not mean half of 2; it merely indicates a difference.
This scale of measurement is used mainly for the compiling of frequency tables (Smit 1983: 208).
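As an illustration of how nominal codes lead to a frequency table, the following minimal Python sketch counts how often each code occurs. The list of codes and the label for code 3 are assumptions made for the example; they do not come from the text.

```python
from collections import Counter

# Nominal codes as in the example: 1 = higher grade, 2 = standard grade.
# Code 3 = lower grade is an assumption made for this illustration.
GRADE_LABELS = {1: "higher grade", 2: "standard grade", 3: "lower grade"}

# Hypothetical codes recorded for one subject across ten students.
codes = [1, 2, 2, 1, 3, 2, 1, 2, 3, 2]

# A frequency table counts how often each category occurs.
frequency_table = Counter(codes)

for code in sorted(frequency_table):
    print(GRADE_LABELS[code], frequency_table[code])
```

Note that the codes are only counted, never added or averaged, because on a nominal scale the numbers merely indicate a difference.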
* Ordinal measurement
Ordinal measurement is applicable in cases where a criterion/characteristic is awarded a numerical value in terms of a specific order (hence the name of the scale).
The ordinal scale implies that the entity being measured is quantified in terms of either higher or lower, greater (more) or lesser (less), without specifying the size of the intervals (Leedy 1993: 38). The numeral 1 can be the highest, whilst 3 could be the lowest. The following serves as an example of measurement on the ordinal level: A school wants to select the best student of the year. After the evaluators (in this case the teachers) have nominated the best students, the finalists should be placed in rank order according to the set criteria in order to determine the best student. In this case the numeral 1 indicates the best student, and therefore the winner. The ordinal scale of measurement can be applied to determine the median, percentage, rank order, correlations and percentiles (Smit 1983: 209).
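The rank ordering in the school example could be sketched in Python as follows. The student names and evaluation totals are hypothetical; only the idea that rank 1 marks the best student comes from the text.

```python
# Hypothetical evaluation totals awarded by the teachers (illustrative only).
finalists = {"Student A": 82, "Student B": 91, "Student C": 77}

# Place the finalists in rank order, best first.
ranked = sorted(finalists, key=finalists.get, reverse=True)

# Rank 1 indicates the best student; the ranks say nothing about
# how far apart the students are.
ranks = {name: position for position, name in enumerate(ranked, start=1)}

for name in ranked:
    print(ranks[name], name)
```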
* Interval measurement
The interval scale of measurement is characterized by two features, namely:
- equal units of measurement (equal intervals); and
- a zero point which has been established arbitrarily (Leedy 1993: 38).
The latter indicates that there is no absolute zero point. There is therefore a specific relationship between the distance (interval) between numerical values and the different sizes of a characteristic. Because of the aforementioned characteristics, this measuring scale is considered to be a more advanced type of measuring scale. An increase or decrease of the one characteristic goes hand in hand with an increase or decrease of the other. The interval level of measurement enables the researcher to compare aspects and to indicate clearly how much more the one has of a characteristic than the other. The following is an example of this type of measurement: one individual's intelligence score is 110 whilst another's is 100. The difference between these scores is exactly 10. It should also be remembered that intelligence measurement has no zero point. A second example of interval measurement is the Likert attitude scale.
The interval scale of measurement is therefore suitable for calculating arithmetic means and standard deviations and for determining correlations, provided that the researcher takes care that the preconditions set for each scale of measurement are met.
* Ratio measurement
This is considered the highest order of measurement that exists, because of the fixed proportion (ratio) between the number (numeral) and the amount of the characteristic that it represents. It should be mentioned that, when measuring on the ratio level, a fixed (absolute) zero point exists. The ratio level of measurement thus enables researchers to determine whether aspects possess something of a characteristic or not. The following can serve as an example of the ratio level of measurement: The average weight of a gymnast is 55 kilograms. On the other hand, the average weight of a rugby player is 75 kilograms. Kilograms are expressed in constant units, and a zero point does exist, because "no weight" can be determined. Because scores on this type of measuring scale possess absolute values, any type of arithmetical calculation is allowed.
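Because a ratio scale has an absolute zero point, both differences and ratios between scores can be interpreted. A minimal Python sketch using the two weights from the example:

```python
# Average weights from the example, in kilograms (a true zero point exists).
gymnast_kg = 55
rugby_player_kg = 75

# Differences are meaningful, as on the interval scale.
difference_kg = rugby_player_kg - gymnast_kg

# Ratios are meaningful too, which is what distinguishes the ratio scale.
ratio = rugby_player_kg / gymnast_kg

print(difference_kg)    # 20
print(round(ratio, 2))  # 1.36
```

By contrast, for the intelligence scores of 110 and 100 only the difference of 10 can be interpreted, since that scale has no absolute zero point.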
2 Characteristics of measuring scales
With any type of measurement, two considerations are important - validity on the one hand and reliability on the other hand.
Reliability is the term used to deal with accuracy. A scale of measurement is considered reliable if it measures consistently, that is, if it yields the same results each time it is applied under the same conditions. A further refinement of the term is that, when a test is repeated by the same researcher, for example with a different group representing the original group, the same results should be obtained.
Validity, on the other hand, is concerned with the soundness and the effectiveness of the measuring instrument (Leedy 1993: 40).
From the literature consulted four types of validity stand out, namely:
- content validity,
- prognostic validity,
- simultaneous validity; and
- construct validity.
Content validity deals with the accuracy with which an instrument measures the factors or content of the course or situations under study.
Prognostic validity on the other hand relies basically upon the possibility to make judgements by virtue of results obtained by the instrument. The judgement is future orientated. Consider the following as an example: it can be predicted from the matriculation results of a prospective student that he would be a successful medical student.
Simultaneous validity is tested by comparing one measuring instrument to another one that measures the same characteristic and which is available immediately. The second, as a criterion, checks the accuracy of the first measure and sets a standard against which to measure the results. The data of the measuring instrument should correlate with equivalent data of the criterion.
Construct validity concerns the degree to which the construct is actually measured.
3 Classification of statistical methods
Before a researcher can use a statistical method for his/her research, he/she should be familiar with the various statistical methods as well as the prerequisites for implementing them. Because of the breadth of statistical methods, an in-depth discussion is not possible within this element. It will suffice to highlight the basic statistical methods.
Statistical methods in the broadest sense are classified into two main groups, namely descriptive and inferential statistics.
* Descriptive statistics
Smit (1983: 212) sees descriptive statistics as the formulation of rules and procedures according to which data can be placed in useful and significant order. Landman (1988: 94) states that descriptive statistics deals with the central tendency, variability (variation) and relationships (correlations) in data that are readily at hand. The basic principle for using descriptive statistics is the requirement for absolute representation of data.
The most important and general methods used are:
- Ratios. This indicates the relative frequency of the various variables to one another, for example 1.
- Percentages. Percentages (%) are calculated by multiplying a ratio by 100. In other words, a percentage is a ratio expressed in terms of a standard unit of 100.
- Frequency tables. A frequency table is a means of tabulating how often a specific measurement recurs, for example a specific achievement in a test. Data arranged in such a manner are known as a distribution. If the range of the distribution is large, larger class intervals are used in order to obtain a more systematic and orderly table.
In order to understand and interpret a frequency table, you are referred to Huysamen (1976: 24).
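The ratios, percentages and frequency tables described above can be sketched in Python as follows. The scores and the class interval width of 10 are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical test scores (illustrative only).
scores = [35, 42, 47, 51, 55, 58, 63, 64, 68, 71, 77, 85]

CLASS_WIDTH = 10  # assumed width of each class interval


def class_interval(score, width=CLASS_WIDTH):
    """Return the (lower, upper) limits of the class interval containing score."""
    lower = (score // width) * width
    return (lower, lower + width - 1)


# Frequency table: number of scores per class interval.
frequencies = Counter(class_interval(s) for s in scores)

total = len(scores)
for (lower, upper), count in sorted(frequencies.items()):
    ratio = count / total     # relative frequency
    percentage = ratio * 100  # the ratio expressed per 100
    print(f"{lower}-{upper}: frequency {count}, {percentage:.1f}%")
```

Grouping into class intervals keeps the table short when the scores are spread over a wide range.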
* The histogram. The histogram is a graphic representation of a frequency distribution and is used to represent a simple frequency distribution. It is characterized by a vertical line (the y axis or ordinate) at the left side of the figure and a horizontal line (the x axis) at the bottom. The two lines meet at a 90-degree angle.
Because frequencies should be divided into class intervals, the benefit of graphic presentation is that data can be observed immediately.
* Frequency polygon. The frequency polygon does not differ fundamentally from the histogram, but is used only for continuous data. Instead of drawing the bars of a complete histogram, a dot indicating the frequency is placed above the midpoint of each class interval. When the dots are linked up, the frequency polygon is formed. Usually an additional class is added at the end of the line in order to form an anchor.
* Cumulative frequency curve. The frequencies in the frequency table are added, starting from the lowest class interval and adding class by class. The cumulative frequency at a specific class interval then clearly indicates how many persons/measurements fall below or above that class interval. In other words, from cumulative frequency tables a curve can be drawn to reflect the data in a graphic manner (Smit 1983: 213).
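The class-by-class addition that produces cumulative frequencies can be sketched in Python. The class intervals and frequencies below are hypothetical.

```python
from itertools import accumulate

# Hypothetical frequency table, lowest class interval first.
class_intervals = ["30-39", "40-49", "50-59", "60-69", "70-79", "80-89"]
frequencies = [1, 2, 3, 3, 2, 1]

# Add the frequencies class by class, starting from the lowest interval.
cumulative = list(accumulate(frequencies))

for interval, cum in zip(class_intervals, cumulative):
    print(interval, cum)
```

The last cumulative frequency always equals the total number of measurements, which is a quick check on the table.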
* Percentile curve. The cumulative frequency can also be converted into percentages or proportions of the distribution. From such a table, one can read off the percentage or proportion of persons or cases that fall below a certain point on the scale. The scale value below which 10% of the scores in a distribution fall is regarded as P10 (the 10th percentile). The value below which 25% of the scores fall is the first quartile, or P25, etc. (Smit 1983: 213).
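Converting cumulative frequencies into cumulative percentages can be sketched as follows. The frequencies and upper class limits are hypothetical values chosen for the illustration.

```python
# Hypothetical frequency table, lowest class interval first.
upper_limits = [39, 49, 59, 69, 79, 89]
frequencies = [1, 2, 3, 3, 2, 1]

total = sum(frequencies)

# Running total of frequencies, expressed as a percentage of all scores.
running = 0
cumulative_percentages = []
for frequency in frequencies:
    running += frequency
    cumulative_percentages.append(100 * running / total)

for limit, percentage in zip(upper_limits, cumulative_percentages):
    print(f"up to {limit}: {percentage:.1f}%")
```

In this made-up table 25% of the scores lie at or below the upper limit of the 40-49 class, so P25 would be read off at that point on the scale.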
* Line graph. In the previous graphic presentations the horizontal line (X axis) indicated the scale of measurement, whilst the vertical line (Y axis) indicated the frequency. In the case of a line graph, both axes (X and Y) are used to indicate a scale of measurement, with the aim of comparing two comparable variables (Smit 1983: 214).
4 Central tendency
Central tendency is defined as the central point around which data revolve. The following techniques can be employed:
* The mode
The mode is defined as the score (value or category) of the variable which is observed most frequently. For example:
3 7 5 8 6 4 5 9 5
From the above-mentioned scores, the mode equals 5, because 5 is the most frequent score amongst all the numbers (it occurs three times).
* The median
The median indicates the middle value of a series of sequentially ordered scores. Because the median divides the frequencies into two equal parts, it can also be described as the fiftieth percentile.
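The mode of the scores above can be checked with a short Python sketch that counts how often each score occurs:

```python
from collections import Counter

# Scores from the example in the text.
scores = [3, 7, 5, 8, 6, 4, 5, 9, 5]

# Count each score and take the most frequent one.
counts = Counter(scores)
mode, frequency = counts.most_common(1)[0]

print(mode, frequency)  # 5 3
```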
10 13 14 15 18 19 22 25 25
The median in the above-mentioned series is the fifth score, that is 18. There are four scores on either side of the numerical value 18.
In cases where you have, for instance:
10 13 14 15 18 19 22 25 26 29
there are two numerical values in the middle of the sequentially ordered scores: the fifth score, with a numerical value of 18, and the sixth score, with a numerical value of 19. The median is determined by adding these two values and dividing the result by 2. The median for the above-mentioned scores is therefore (18 + 19) ÷ 2 = 18.5. Because 18.5 does not occur in the sequentially ordered scores, Huysamen (1983: 50) states that in such cases one should rather refer to it as the fiftieth percentile.
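Both median examples can be reproduced with Python's standard `statistics` module, which returns the middle score for an odd number of scores and the average of the two middle scores for an even number:

```python
import statistics

# The two series of ordered scores from the text.
odd_series = [10, 13, 14, 15, 18, 19, 22, 25, 25]
even_series = [10, 13, 14, 15, 18, 19, 22, 25, 26, 29]

median_odd = statistics.median(odd_series)    # the fifth of nine scores
median_even = statistics.median(even_series)  # mean of fifth and sixth scores

print(median_odd)   # 18
print(median_even)  # 18.5
```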
* Arithmetic mean
The arithmetic mean refers to a measure of central tendencies found by adding all scores and dividing them by the number of scores. The following is an example:
5 2 6 1 6
Arithmetic mean = sum total of scores ÷ N
Thus 5 + 2 + 6 + 1 + 6 = 20. Because there are 5 scores, N = 5, and the sum total of the scores (20) is divided by 5, giving an arithmetic mean of 4.
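The same calculation as a minimal Python sketch, using the five scores from the example:

```python
# Scores from the example in the text.
scores = [5, 2, 6, 1, 6]

total = sum(scores)  # sum total of the scores
n = len(scores)      # N, the number of scores
arithmetic_mean = total / n

print(total, n, arithmetic_mean)  # 20 5 4.0
```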
* Standard deviation
The standard deviation is a measure of the spread or dispersion of a distribution of scores. The deviation of each score from the mean is squared; the squared deviations are then summed, the result is divided by N - 1, and the square root is taken (Landman 1988: 94).
* Inferential statistics
Apart from descriptive statistics, which deal with central tendencies, there are also statistical methods that enable researchers to go from known to unknown data, that is, to make deductions or statements about the broad population on the basis of the samples from which the 'known' data are drawn. According to the literature these methods are called inferential or inductive statistics (Landman 1988: 95). They include estimation, prediction, hypothesis testing and so forth.
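The steps in this definition can be followed one by one in a short Python sketch. The scores are reused from the arithmetic mean example purely for illustration; the comparison with `statistics.stdev`, which also divides by N - 1, is included only as a check.

```python
import math
import statistics

# Scores reused from the arithmetic mean example (illustrative choice).
scores = [5, 2, 6, 1, 6]
n = len(scores)
mean = sum(scores) / n

# Square the deviation of each score from the mean.
squared_deviations = [(score - mean) ** 2 for score in scores]

# Sum the squared deviations, divide by N - 1, and take the square root.
variance = sum(squared_deviations) / (n - 1)
standard_deviation = math.sqrt(variance)

print(round(standard_deviation, 2))  # 2.35

# The standard library's sample standard deviation gives the same value.
print(math.isclose(standard_deviation, statistics.stdev(scores)))  # True
```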
In conclusion the role of statistical methods in research is to enable the researcher to accurately utilize the gathered information and to be more specific in describing his findings. For more details on statistical calculations you are referred to Huysamen (1976).
5 SELF TEST
a) The following marks were allocated to students during a test they wrote:
33 44 69 66 72 46 69
44 61 80 73 42 73 88
62 81 75 50 71 56 86
60 86 54 80 87 63 49
Compile the following from the scores presented to you:
- frequency distribution
- frequency polygon.
b) Calculate the arithmetic mean, mode and median of the following:
8 7 3 21 16 34 22 18 19.
c) Determine the mode and median of the following:
8 11 12 3 31 12 8 9 12 10 5.
6 SOURCE LIST
Huysamen, GK 1976 Beskrywende Statistiek vir Sosiale Wetenskappe. Pretoria: Academica.
Landman, WA 1988 Navorsingsmetodologiese Grondbegrippe. Pretoria: Serva.
Leedy, PD 1985 Practical Research: Planning and Design. Third Edition. New York: Macmillan Publishing Co.
Smit, GJ 1983 Navorsingsmetodes in die geesteswetenskappe. Pretoria: HAUM.
7 ANSWERS TO SELF TEST
b Arithmetic mean = 148 ÷ 9 ≈ 16.4
Mode = none (no score occurs more than once)
Median = 18
c Mode = 12
Median = 10