Attitude Measurement and Scales

The term scaling refers to attempts to measure attitudes objectively. Attitude is the resultant of a number of external and internal factors. Depending upon the attitude to be measured, appropriate scales are designed. Scaling is a technique used for measuring qualitative responses of respondents, such as those related to their feelings, perceptions, likes, dislikes, interests and preferences.

Types of Scales

Most frequently used Scales

  1. Nominal Scale
  2. Ordinal Scale
  3. Interval Scale
  4. Ratio Scale

Self Rating Scales

  1. Graphic Rating Scale
  2. Itemized Rating Scales
    1. Likert Scale
    2. Semantic Differential Scale
    3. Stapel’s Scale
    4. Multi Dimensional Scaling
    5. Thurstone Scales
    6. Guttman Scales/Scalogram Analysis
    7. The Q Sort technique

Four types of scales are generally used for Marketing Research.

1. Nominal Scale

This is a very simple scale. It consists of assigning facts/choices to various alternative categories which are usually exhaustive as well as mutually exclusive. In a nominal scale, numbers are no more than labels, used specifically to identify different categories of responses; such scales are the least restrictive of all the scales. Instances of nominal scales are credit card numbers, bank account numbers, employee ID numbers, etc. The scale is simple and widely used when the relationship between two variables is to be studied. The following example illustrates this:

What is your gender?
[  ] Male
[  ] Female

Another example is a survey of retail stores done on two dimensions: the way of maintaining stocks and the daily turnover.

How do you stock items at present?
[  ] By product category
[  ] At a centralized store
[  ] Department wise
[  ] Single warehouse

What is the daily turnover of customers?
[  ] Between 100 – 200
[  ] Between 200 – 300
[  ] Above 300

A two-way classification can be made as follows:

Daily Turnover      Product Category   Department wise   Centralized Store   Single Warehouse
100 – 200
200 – 300
Above 300

(Each cell records the number of stores with that combination of stocking method and daily turnover.)

The mode is the measure of central tendency most frequently used for nominal response categories.
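To make this concrete, here is a minimal sketch in Python (using pandas, with hypothetical store responses) of how the two-way classification and the mode could be computed:

```python
# Minimal sketch: tabulating nominal responses with pandas.
# The store responses below are hypothetical illustrations only.
import pandas as pd

responses = pd.DataFrame({
    "stock_method": ["Product Category", "Department wise", "Product Category",
                     "Single Warehouse", "Centralized Store", "Product Category"],
    "daily_turnover": ["100-200", "200-300", "Above 300",
                       "100-200", "200-300", "100-200"],
})

# Two-way classification of the two nominal variables
print(pd.crosstab(responses["daily_turnover"], responses["stock_method"]))

# The mode is the appropriate "average" for nominal response categories
print(responses["stock_method"].mode())
```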

2. Ordinal Scale

The ordinal scale is the simplest attitude-measuring scale used in Marketing Research. It is more powerful than a nominal scale in that the numbers possess the property of rank order. The ranking of certain product attributes/benefits deemed important by the respondents is obtained through this scale.

Example 1: Rank the following attributes (1 – 5), on their importance in a microwave oven.

  1. Company Name
  2. Functions
  3. Price
  4. Comfort
  5. Design

The most important attribute is ranked 1 by the respondents and the least important is ranked 5. Instead of numbers, letters or symbols can also be used to rate in an ordinal scale. Such a scale makes no attempt to measure the degree of favourability of the different rankings.

Example 2 – If there are 4 different types of fertilizers and they are ordered on the basis of quality as Grade A, Grade B, Grade C and Grade D, the result is again an ordinal scale.

Example 3 – If there are 5 different brands of talcum powder and a respondent ranks them based on, say, “Freshness”, with Rank 1 having maximum freshness, Rank 2 the second maximum freshness, and so on, an ordinal scale results.

The median and mode are meaningful measures for an ordinal scale.

3. Interval Scale

In an interval scale, unlike a nominal or ordinal scale, the distances between the various categories or numbers are equal. Interval scales are also termed rating scales. An interval scale has an arbitrary zero point with further numbers placed at equal intervals. A very good example of an interval scale is a thermometer.

Illustration 1: How do you rate your present refrigerator on the following qualities?

Company Name:          Less known          1 2 3 4 5   Well known
Functions:             Few                 1 2 3 4 5   Many
Price:                 Low                 1 2 3 4 5   High
Design:                Poor                1 2 3 4 5   Good
Overall Satisfaction:  Very dissatisfied   1 2 3 4 5   Very satisfied

Such a scale permits the researcher to say that position 5 on the scale is above position 4 and that the distance from 5 to 4 is the same as the distance from 4 to 3. Such a scale, however, does not permit the conclusion that position 4 is twice as strong as position 2, because no zero position has been established. The data obtained from an interval scale can be used to calculate the mean score of each attribute over all respondents. The standard deviation (a measure of dispersion) can also be calculated.
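As an illustration, here is a minimal sketch computing the mean score and standard deviation of hypothetical 1–5 ratings for one attribute:

```python
# Minimal sketch: mean and standard deviation of interval-scale ratings.
# The 1-5 ratings below are hypothetical responses for the "Design" attribute.
import statistics

design_ratings = [4, 5, 3, 4, 2, 5, 4, 3]

mean_score = statistics.mean(design_ratings)
std_dev = statistics.stdev(design_ratings)  # sample standard deviation

print(f"Mean = {mean_score:.2f}, SD = {std_dev:.2f}")
```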

4. Ratio Scale

Ratio scales are not widely used in Marketing Research unless a base item is made available for comparison. In the above example of an interval scale, a score of 4 on one quality does not necessarily mean that the respondent is twice as satisfied as the respondent who marks 2 on the scale. A ratio scale has a natural zero point, with further numbers placed at equal intervals. Examples are scales for measuring physical quantities like length, weight, etc.

Ratio scales are very common in physical scenarios, and quantified responses forming a ratio scale are analytically the most versatile. Ratio scales possess all the characteristics of an interval scale, and the ratios of the numbers on these scales have meaningful interpretations. Data on certain demographic or descriptive attributes, if they are obtained through open-ended questions, will have ratio-scale properties. Consider the following questions:

Q 1) What is your annual income before taxes? ______ $
Q 2) How far is the Theater from your home ? ______ miles

Answers to these questions have a natural, unambiguous starting point, namely zero. Since the starting point is not chosen arbitrarily, computing and interpreting ratios makes sense. For example, we can say that a respondent with an annual income of $40,000 earns twice as much as one with an annual income of $20,000.

Self Rating Scales

1. Graphic Rating Scale

The respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other. Example:

BRAND 1: 0 (poor quality) ----- 1 (bad quality) ----- 5 (neither good nor bad) ----- 7 (good quality)

This is also known as a continuous rating scale. The respondent can occupy any position along the line. Here one attribute is taken, e.g. the quality of a brand of ice cream.

BRAND 2: poor ------------------------------------------------ good

This line can be vertical or horizontal and scale points may be provided, but no other indication appears on the continuous scale; only a range is provided. To quantify the responses to the question “Indicate your overall opinion about ice cream Brand 2 by placing a tick mark at the appropriate position on the line”, we measure the physical distance between the left extreme position and the response position on the line; the greater the distance, the more favourable is the response or attitude towards the brand.

Its limitation is that coding and analysis require a substantial amount of time, since we first have to measure the physical distance on the scale for each respondent.
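A minimal sketch of that scoring step, assuming hypothetical marks measured in millimetres on a 100 mm line:

```python
# Minimal sketch: converting the measured position of a tick mark on a
# continuous rating line into a 0-100 score. The line length and the
# measured distances are hypothetical.
LINE_LENGTH_MM = 100.0

def graphic_scale_score(distance_from_left_mm: float) -> float:
    """The greater the distance from the left ('poor') end, the more
    favourable the attitude towards the brand."""
    return round(100 * distance_from_left_mm / LINE_LENGTH_MM, 1)

print(graphic_scale_score(73.5))  # a mark 73.5 mm along the line scores 73.5
```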

2. Itemized Rating Scales

These scales differ from continuous rating scales in that they have a number of brief descriptions associated with each category. They are widely used in Marketing Research and essentially take the form of multiple-category questions. The most common are the Likert, semantic differential, Stapel and multidimensional scales; others are the Thurstone and Guttman scales.

a. Likert Scale

It was developed by Rensis Likert. Here the respondents are asked to indicate their degree of agreement or disagreement with each of a series of statements. Each scale item has 5 response categories, ranging from strongly agree to strongly disagree.

5 – Strongly agree
4 – Agree
3 – Indifferent
2 – Disagree
1 – Strongly disagree

Each statement is assigned a numerical score ranging from 1 to 5. It can also be scaled as -2 to +2.

-2 -1 0 1 2

For example, if “The quality of Mother Dairy ice cream is not good” is the (negative) statement, then strongly agreeing with it means the respondent considers the quality to be poor.

Each degree of agreement is given a numerical score, and the respondent's total score is computed by summing these scores. This total score reveals the respondent's overall opinion.
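As an illustration of summated scoring, here is a minimal sketch with hypothetical statements and responses; negatively worded statements are reverse-coded before summing:

```python
# Minimal sketch: a respondent's total (summated) Likert score.
# Statements and responses are hypothetical; negative statements are
# reverse-coded so a higher total always means a more favourable attitude.
responses = {
    "The ice cream tastes fresh.": 5,    # strongly agree
    "The quality is not good.": 4,       # agree (negatively worded statement)
    "I would buy this brand again.": 3,  # indifferent
}
negative_items = {"The quality is not good."}

total = sum(
    (6 - score) if item in negative_items else score  # reverse-code 1..5
    for item, score in responses.items()
)
print("Summated Likert score:", total)
```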

Likert scales are of the ordinal type; they enable one to rank attitudes but not to measure the differences between attitudes. They take about the same amount of effort to create as a Thurstone scale and are considered more discriminating and reliable because of the larger range of responses typically given in a Likert scale.

A typical Likert scale has 20 – 30 statements. While designing a good Likert scale, a large pool of statements relevant to the measurement of the attitude is first generated, and then the statements which are vague or non-discriminating are eliminated from the pool.

Thus, the Likert scale is a five-point scale ranging from ‘strong agreement’ to ‘strong disagreement’. No judging gap is involved in this method.

b. Semantic Differential Scale

This is a seven point scale and the end points of the scale are associated with bipolar labels.

Unpleasant 1 2 3 4 5 6 7 Pleasant
Submissive 1 2 3 4 5 6 7 Dominant

Suppose we want to assess the personality of a particular person. We rate the person on bipolar adjective pairs such as:

  1. Unpleasant/Pleasant
  2. Submissive/Dominant

Bipolar means two opposite extremes. An individual can score between 1 and 7 (or −3 and +3) on each pair. On the basis of these responses, profiles are drawn up. We can analyse two or three products in this way, and by joining the ratings we obtain a profile for each; the profile can take any shape, depending on the number of variables.

Profile Analysis

(Profile lines: each product's mean ratings across the scale items are plotted and joined to form its profile.)

Mean and median are used for comparison. This scale helps to determine overall similarities and differences among objects.

When the semantic differential scale is used to develop an image profile, it provides a good basis for comparing the images of two or more items. The big advantage of this scale is its simplicity while producing results comparable with those of more complex scaling methods. The method is easy and fast to administer, and it is also sensitive to small differences in attitude, highly versatile, reliable and generally valid.
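To make profile comparison concrete, here is a minimal sketch computing the mean rating per bipolar pair for two hypothetical brands; joining each brand's means across the pairs gives its profile line:

```python
# Minimal sketch: semantic differential image profiles (1-7 ratings).
# All ratings below are hypothetical.
from statistics import mean

ratings = {
    "Brand 1": {"Unpleasant-Pleasant": [6, 5, 7], "Submissive-Dominant": [4, 5, 4]},
    "Brand 2": {"Unpleasant-Pleasant": [3, 4, 2], "Submissive-Dominant": [6, 7, 6]},
}

for brand, pairs in ratings.items():
    profile = {pair: round(mean(scores), 2) for pair, scores in pairs.items()}
    print(brand, profile)  # the per-pair means form the brand's profile
```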

c. Stapel’s Scale

It was developed by Jan Stapel. This scale has some distinctive features:

  • Each item has only one word/phrase indicating the dimension it represents.
  • Each item has ten response categories.
  • Each item has an even number of categories.
  • The response categories have numerical labels but no verbal labels.

For example, suppose that for the quality of ice cream we ask respondents to rate from +5 to −5: select a plus number for words which describe the ice cream accurately, and select a minus number for words which do not describe the ice cream quality accurately. Thus, we can select any number from +5, for words we think are very accurate, to −5, for words we think are very inaccurate. The scale is usually presented vertically:

+5
+4
+3
+2
+1
High Quality
-1
-2
-3
-4
-5

This is a unipolar rating scale.

d. Multi Dimensional Scaling

It consists of a group of analytical techniques which are used to study consumer attitudes, particularly perceptions and preferences. It is used to study:

  1. The major attributes of a given class of products as perceived by consumers when considering the product and comparing different brands.
  2. Which brands compete most directly with each other.
  3. Whether consumers would like a new brand with a combination of characteristics not found in the market.
  4. What the consumers' ideal combination of product attributes would be.
  5. What sales and advertising messages are compatible with consumers' brand perceptions.

It is a computer-based technique. The respondents are asked to place the various brands into different groups such as similar, very similar, not similar, and so on. Goodness of fit is traded off across a large number of attributes, and a lack-of-fit index is then calculated by the computer program. The purpose is to find a reasonably small number of dimensions which will eliminate most of the stress. After the configuration of the consumers' perceptions has been developed, the next step is to determine their preferences with regard to the product under study. These techniques attempt to identify the product attributes that are important to consumers and to measure their relative importance.
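As an illustration, here is a minimal sketch using scikit-learn's MDS with a hypothetical precomputed brand dissimilarity matrix; the fitted stress_ value plays the role of the lack-of-fit index mentioned above:

```python
# Minimal sketch: multidimensional scaling of brand dissimilarities.
# The dissimilarity matrix is hypothetical (0 = identical; larger values =
# less similar, as judged by respondents).
import numpy as np
from sklearn.manifold import MDS

brands = ["Brand A", "Brand B", "Brand C", "Brand D"]
dissimilarity = np.array([
    [0.0, 1.0, 4.0, 3.0],
    [1.0, 0.0, 3.5, 2.5],
    [4.0, 3.5, 0.0, 1.5],
    [3.0, 2.5, 1.5, 0.0],
])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)

print("Stress (lack-of-fit index):", round(mds.stress_, 3))
for brand, (x, y) in zip(brands, coords):
    print(f"{brand}: ({x:.2f}, {y:.2f})")  # nearby brands compete most directly
```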

This scaling involves the unrealistic assumption that a consumer who compares different brands perceives the differences on the basis of only one attribute. For example, consider the attributes for joining an M.Com course: the responses may be to do a PG degree, to go into the teaching line, to gain knowledge, or to appear in the NET. There are a number of attributes, and a decision cannot be based on one attribute alone. When consumers choose between brands, they base their decision on various attributes; in practice, any one consumer perceives each brand as a composite of a number of different attributes. This is a shortcoming of this scale.

Whenever we choose from a number of alternatives, multidimensional scaling can be applied. There are many possible uses of such scaling, for example in market segmentation, product life cycle analysis, vendor evaluation and advertising media selection.

The limitation of this scale is that it is difficult to define the concepts of similarity and preference clearly. Further, different respondents may perceive the distances between items differently.

e. Thurstone Scales

These are also known as equal-appearing interval scales. They are used to measure the attitude towards a given concept or construct. For this purpose a large number of statements are collected that relate to the concept or construct being measured. Judges rate these statements along an 11-category scale in which each category expresses a different degree of favourableness towards the concept. The items are then ranked according to the mean or median ratings assigned by the judges and are used to construct a questionnaire of twenty to thirty items chosen more or less evenly across the range of ratings.

The statements are worded in such a way that a person can agree or disagree with them. The scale is then administered to a sample of respondents, whose scores are determined by computing the mean or median value of the items they agree with. A person who disagrees with all the items has a score of zero. The advantage of this scale is that it is an interval measurement scale, but the method is time-consuming and labour-intensive. Thurstone scales are commonly used in psychology and education research.
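A minimal sketch of this scoring procedure, using hypothetical judge ratings on the 11-category scale:

```python
# Minimal sketch: Thurstone (equal-appearing intervals) scoring.
# Judge ratings (1-11) per statement are hypothetical. Each statement's
# scale value is the median judge rating; a respondent's score is the
# mean scale value of the statements agreed with.
from statistics import median, mean

judge_ratings = {
    "This brand is the best on the market.": [10, 11, 9, 10],
    "This brand is acceptable.":             [6, 5, 6, 7],
    "I would never buy this brand.":         [1, 2, 1, 2],
}
scale_values = {s: median(r) for s, r in judge_ratings.items()}

agreed_with = ["This brand is the best on the market.", "This brand is acceptable."]
respondent_score = mean(scale_values[s] for s in agreed_with)
print("Respondent score:", respondent_score)
```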

f. Guttman Scales/Scalogram Analysis

It is based on the idea that items can be arranged along a continuum in such a way that a person who agrees with an item, or finds it acceptable, will also agree with or find acceptable all other items expressing a less extreme position. For example, the statements “children should not be allowed to watch indecent programmes”, “the government should ban these programmes” and “they should not be allowed to air on television” all relate to one aspect and differ in extremity.

In this scale each score represents a unique set of responses and therefore the total score of every individual is obtained. This scale takes a lot of time and effort in development.

They are very commonly used in political science, anthropology, public opinion research and psychology.

g. The Q Sort technique

It is used to discriminate among a large number of objects quickly. It uses a rank-order procedure in which the objects are sorted into piles based on their similarity with respect to some criterion. The number of objects to be sorted should be between 60 and 140, approximately. For example, suppose we take nine brands; on the basis of taste we classify the brands as tasty, moderate and non-tasty.

We can also classify on the basis of price: low, medium, high. Then we can learn whether people prefer low-priced, medium-priced or high-priced brands. In the same way, sixty brands can be sorted into three piles, so that each object is placed in one of the three piles: low, medium or high.

Thus, the Q-sort technique is an attempt to classify subjects in terms of their similarity with respect to the attribute under study.
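A minimal sketch of such a classification, using hypothetical taste scores for nine brands:

```python
# Minimal sketch: Q-sort style classification of nine brands into three
# piles by taste. The scores are hypothetical.
taste_scores = {"B1": 8, "B2": 3, "B3": 5, "B4": 9, "B5": 2,
                "B6": 6, "B7": 7, "B8": 4, "B9": 1}

piles = {"non tasty": [], "moderate": [], "tasty": []}
for brand, score in taste_scores.items():
    if score <= 3:
        piles["non tasty"].append(brand)
    elif score <= 6:
        piles["moderate"].append(brand)
    else:
        piles["tasty"].append(brand)

print(piles)
```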

Importance of Sampling

Types of Sampling

There are many different types of sampling methods, here’s a summary of the most common:

Cluster sampling

Units in the population can often be found in certain geographic groups or “clusters”, for example, primary school children in Derbyshire.

A random sample of clusters is taken, then all units within the cluster are examined.

Advantages

  • Quick and easy
  • Doesn’t need complete population information
  • Good for face-to-face surveys

Disadvantages

  • Expensive if the clusters are large
  • Greater risk of sampling error

Convenience sampling

Uses those who are willing to volunteer and are easiest to involve in the study.

Advantages

  • Subjects are readily available
  • Large amounts of information can be gathered quickly

Disadvantages

  • The sample is not representative of the entire population, so results cannot be generalized to it; inferences are limited.
  • Prone to volunteer bias

Judgement sampling

A deliberate choice of a sample; the opposite of random sampling.

Advantages

  • Good for providing illustrative examples or case studies

Disadvantages

  • Very prone to bias
  • Samples often small
  • Cannot extrapolate from sample

Quota sampling

The aim is to obtain a sample that is “representative” of the overall population.

The population is divided (“stratified”) by the most important variables such as income, age and location. The required quota sample is then drawn from each stratum.

Advantages

  • Quick and easy way of obtaining a sample

Disadvantages

  • Not random, so some risk of bias
  • Need to understand the population to be able to identify the basis of stratification

Simple random sampling

This makes sure that every member of the population has an equal chance of selection.

Advantages

  • Simple to design and interpret
  • Can calculate both estimates of the population and the sampling error

Disadvantages

  • Need a complete and accurate population listing
  • May not be practical if the sample requires many visits spread across the country

Systematic sampling

  • After randomly selecting a starting point between 1 and n, every nth unit is selected, as sketched below.

Here n equals the population size divided by the sample size.
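A minimal sketch of this selection procedure, with a hypothetical population of 1,000 units and a sample of 50:

```python
# Minimal sketch: systematic sampling. Population and sample size are
# hypothetical.
import random

population = list(range(1, 1001))   # 1,000 units
sample_size = 50
n = len(population) // sample_size  # n = 20

start = random.randint(1, n)        # random starting point between 1 and n
sample = population[start - 1::n]   # then every nth unit

print(len(sample), sample[:5])
```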

Advantages

  • Easier to extract the sample than via simple random
  • Ensures sample is spread across the population

Disadvantages

  • Can be costly and time consuming if the sample is not conveniently located

Importance of Sampling Design

Save Time

Contacting everyone in a population takes time. And, invariably, some people will not respond to the first effort at contacting them, meaning researchers have to invest more time for follow-up. Random sampling is much faster than surveying everyone in a population, and obtaining a non-random sample is almost always faster than random sampling. Thus, sampling saves researchers lots of time.

Save Money

The number of people a researcher contacts is directly related to the cost of a study. Sampling saves money by allowing researchers to gather the same answers from a sample that they would receive from the population.

Non-random sampling is significantly cheaper than random sampling, because it lowers the cost associated with finding people and collecting data from them. Because all research is conducted on a budget, saving money is important.

Collect Richer Data

Sometimes, the goal of research is to collect a little bit of data from a lot of people (e.g., an opinion poll). At other times, the goal is to collect a lot of information from just a few people (e.g., a user study or ethnographic interview). Either way, sampling allows researchers to ask participants more questions and to gather richer data than does contacting everyone in a population.

The Importance of Knowing Where to Sample

Efficient sampling has a number of benefits for researchers. But just as important as knowing how to sample is knowing where to sample. Some research participants are better suited for the purposes of a project than others. Finding participants that are fit for the purpose of a project is crucial, because it allows researchers to gather high-quality data.

For example, consider an online research project. A team of researchers who decides to conduct a study online has several different sources of participants to choose from. Some sources provide a random sample, and many more provide a non-random sample. When selecting a non-random sample, researchers have several options to consider. Some studies are especially well-suited to an online panel that offers access to millions of different participants worldwide. Other studies, meanwhile, are better suited to a crowdsourced site that generally has fewer participants overall but more flexibility for fostering participant engagement.

To make these options more tangible, let’s look at examples of when researchers might use different kinds of online samples.

Methods used for collection of different Data Types

Quantitative data collection methods typically use standardized response categories; surveys are the most common example. Respondents are asked to choose among responses that best characterize their perceptions, attitudes, knowledge, or opinions. The advantage of quantitative data is that it efficiently measures the reactions of many people, which facilitates statistical aggregation of the data, including making comparisons by subgroups. Using sound sampling procedures to represent the population and obtaining adequate response rates are critical. Provided your sample size is large enough, and your methods and analysis are sound, this method of data collection provides a broad, generalizable set of findings. This means that they can be used to learn about the entire population that you are studying.

By contrast, qualitative data collection methods typically produce detailed data about a much smaller number of people. Qualitative data can provide rich information through direct quotation and careful description of programs, events, people, interactions, and observed behaviors. The advantage and disadvantage of such descriptions, quotations, and case studies is that they are collected as open-ended narratives. Observations do not fit neatly into categories, so rigorous and systematic analysis of the content can be tedious and time-consuming.

One of the most common qualitative data collection techniques is the interview, which may be with individuals or a group. In a group interview, or focus group, a moderator conducts a discussion among five to ten people in order to learn their opinions, attitudes, and thought processes about a given topic. The group dynamic encourages a deeper level of discussion and allows the moderator to probe for topics that are important. Note that the term focus group is often misused to refer to any meeting of any group of people about a given topic. In actuality, focus groups, as well as individual interviews, are systematically structured, and discussion is carefully guided to allow for drawing conclusions and making comparisons. Qualitative data can also be collected from written sources such as journals, open-ended survey questions, and reaction sheets completed by observers or participants.

An ethnographic approach to evaluation collects qualitative data. Maribel Alvarez describes in her case study, Two-Way Mirror: Ethnography as a Way to Assess Civic Impact of Arts-based Engagement in Tucson, AZ, that ethnographic evaluation emphasizes listening carefully and observing real-life actions to understand how people make sense of their lives. An ethnographic evaluation produces “data” of a distinct kind: subjective accounts of how people actually interact with systems, programs, and policies. This data is collected through the experiences of the evaluator in the field, side by side with participants.

Data Collection

Data collection is a process of collecting information from all the relevant sources to find answers to the research problem, test the hypothesis and evaluate the outcomes. Data collection methods can be divided into two categories: secondary methods of data collection and primary methods of data collection.

Methods of data collection for primary and secondary Data

(1) Primary data

Primary data are original observations collected by the researcher or his agent for the first time for any investigation and used by them in the statistical analysis.

Primary data is an important type of data: it is collected first-hand by the researcher rather than published by another organization, and it is therefore mostly pure and original.

Primary data can be collected through three different methods:

  • Data Collection through Investigation:

In this method, trained investigators are employed to collect the data. Using tools such as the interview, they gather information from individual persons.

  • Personal Investigation Methods:

The researchers or data collectors conduct the survey themselves and thereby collect the data. This method yields more accurate and original data, but it is suitable only for small-scale data collection, not for large projects.

  • Data Collection through Telephones:

The researcher uses telephones or mobile phones to collect the information. This is a very quick process of data collection, but the information collected may not always be accurate and truthful.

(2) Secondary data

Secondary data is the other type of data; it is collected from second-hand information. This means the data has already been collected by someone else for some other purpose and is available for the present study. Such secondary data is often not fully relevant, pure or original.

Two important methods:

a) Official methods:

Data collected from official bodies such as the Ministries of Finance, Agriculture, Industry, etc. These official methods use tools such as phone calls and surveys.

b) Semi–official methods:

This is the method of collecting data from bodies such as railway boards, banks and population committees, typically through focus groups, interviews and email surveys.

Ways of Collections

In this case the data are already available; they have already been collected and analyzed by someone else. The data can be either published or unpublished. When using secondary data, the following characteristics must be checked:

  • Reliability
  • Suitability
  • Adequacy

Such data can be collected from the following sources:

a) Official

b) Newspapers and journals

c) Research organizations like universities.

Secondary sources are data that already exist:

  • Previous research
  • Official statistics
  • Mass media products
  • Diaries
  • Letters
  • Government reports
  • Web information
  • Historical data and information

Types of Data collection

  1. Observation:

Observation method has occupied an important place in descriptive sociological research. It is the most significant and common technique of data collection. Analysis of questionnaire responses is concerned with what people think and do as revealed by what they put on paper. The responses in interview are revealed by what people express in conversation with the interviewer. Observation seeks to ascertain what people think and do by watching them in action as they express themselves in various situations and activities.

Observation is the process in which one or more persons observe what is occurring in some real life situation and they classify and record pertinent happenings according to some planned schemes. It is used to evaluate the overt behaviour of individuals in controlled or uncontrolled situation. It is a method of research which deals with the external behaviour of persons in appropriate situations.

According to P.V. Young, “Observation is a systematic and deliberate study through eye, of spontaneous occurrences at the time they occur. The purpose of observation is to perceive the nature and extent of significant interrelated elements within complex social phenomena, culture patterns or human conduct”.

From this definition it is clearly understood that observation is a systematic viewing with the help of the eye. Its objective is to discover important mutual relations between spontaneously occurring events and explore the crucial facts of an event or a situation. So it is clearly visible that observation is not simply a random perceiving, but a close look at crucial facts. It is a planned, purposive, systematic and deliberate effort to focus on the significant facts of a situation.

According to Oxford Concise Dictionary, “Observation means accurate watching, knowing of phenomena as they occur in nature with regard to cause and effect or mutual relations”.

This definition focuses on two important points:

Firstly, in observation the observer wants to explore the cause-effect relationships between facts of a phenomenon.

Secondly, various facts are watched accurately, carefully and recorded by the observer.

  2. Interview:

Interview as a technique of data collection is very popular and extensively used in every field of social research. The interview is, in a sense, an oral questionnaire. Instead of writing the response, the interviewee or subject gives the needed information verbally in a face-to-face relationship. The dynamics of interviewing, however, involves much more than an oral questionnaire.

The interview is a relatively more flexible tool than any written inquiry form and permits explanation, adjustment and variation according to the situation. The observational methods, as we know, are restricted mostly to non-verbal acts. So these are understandably not so effective in giving information about a person's past and private behaviour, future actions, attitudes, perceptions, faiths, beliefs, thought processes, motivations, etc.

The interview method as a verbal method is quite significant in securing data about all these aspects. In this method a researcher or an interviewer can interact with his respondents and know their inner feelings and reactions. G.W. Allport in his classic statement sums this up beautifully by saying that “if you want to know how people feel, what they experience and what they remember, what their emotions and motives are like and the reasons for acting as they do, why not ask them”.

Interview is a direct method of inquiry. It is simply stated as a social process in which a person known as the interviewer asks questions usually in a face to face contact to the other person or persons known as interviewee or interviewees. The interviewee responds to these and the interviewer collects various information from these responses through a very healthy and friendly social interaction.

However, it does not mean that all the time it is the interviewer who asks the questions. Often the interviewee may also ask certain questions and the interviewer responds to these. But usually the interviewer initiates the interview and collects the information from the interviewee.

Interview is not a simple two-way conversation between an interrogator and an informant. According to P.V. Young, “interview may be regarded as a systematic method by which a person enters more or less imaginatively into the life of a comparative stranger”. It is a mutual interaction between the two.

The objectives of the interviewer are to penetrate the outer and inner life of persons and to collect information pertaining to a wide range of their experiences, in which the interviewee may wish to rehearse his past, define his present and canvass his future possibilities. The answers of the interviewee may not only be responses to questions but also stimuli to a progressive series of other relevant statements about social and personal phenomena.

In similar fashion, W.J. Goode and P.K. Hatt have observed that “interviewing is fundamentally a process of social interaction”. In the interview two persons are not merely present at the same place but also influence each other emotionally and intellectually.

  3. Schedule:

Schedule is one of the very commonly used tools of data collection in scientific investigation. P.V. Young says “The schedule has been used for collection of personal preferences, social attitudes, beliefs, opinions, behaviour patterns, group practices and habits and much other data”. The increasing use of schedule is probably due to increased emphasis by social scientists on quantitative measurement of uniformly accumulated data.

The schedule is very much similar to the questionnaire, and there is very little difference between the two so far as their construction is concerned. The main difference is that whereas the schedule is used in a direct interview or direct observation, and in it the questions are asked and filled in by the researcher himself, the questionnaire is generally mailed to the respondent, who fills it in and returns it to the researcher. Thus the main difference between them lies in the method of obtaining data.

Goode and Hatt say, “Schedule is the name usually applied to a set of questions which are asked and filled by an interviewer in a face to face situation with other person”. Webster defines a schedule as “a formal list, a catalogue or inventory and may be a counting device, used in formal and standardized inquiries, the sole purpose of which is aiding in the collection of quantitative cross-sectional data”.

The success of the schedule largely depends on the efficiency and tactfulness of the interviewer rather than on the quality of the questions posed. Because the interviewer himself asks all the questions and fills in the answers, the quality of the questions has less significance here.

  4. Questionnaire:

Questionnaire provides the most speedy and simple technique of gathering data about groups of individuals scattered in a wide and extended field. In this method, a questionnaire form is sent usually by post to the persons concerned, with a request to answer the questions and return the questionnaire.

According to Goode and Hatt, “It is a device for securing answers to questions by using a form which the respondent fills in himself.” According to G.A. Lundberg, “Fundamentally the questionnaire is a set of stimuli to which literate people are exposed in order to observe their verbal behaviour under these stimuli”.

Often the terms “questionnaire” and “schedule” are considered synonyms. Technically, however, there is a difference between the two. A questionnaire consists of a set of questions printed or typed in a systematic order on a form or set of forms. This form or set of forms is usually sent by post to the respondents, who are expected to read and understand the questions and reply to them in writing in the spaces provided for the purpose on the said form or forms. Here the respondents have to answer the questions on their own.

On the other hand schedule is also a form or set of forms containing a number of questions. But here the researcher or field worker puts the question to the respondent in a face to face situation, clarifies their doubts, offers the necessary explanation and most significantly fills their answers in the relevant spaces provided for the purpose.

Since the questionnaire is sent to a selected number of individuals, its scope is rather limited but within its limited scope it can prove to be the most effective means of eliciting information, provided that it is well formulated and the respondent fills it properly.

A properly constructed and administered questionnaire may serve as a most appropriate and useful data gathering device.

  5. Projective Techniques:

Psychologists and psychiatrists first devised projective techniques for the diagnosis and treatment of patients afflicted by emotional disorders. Such techniques are adopted to present a comprehensive profile of the individual's personality structure, his conflicts and complexes and his emotional needs. Adoption of such techniques is not an easy affair; it requires intensive specialized training.

The stimuli applied in projective tests may arouse varieties of reaction in the individuals undergoing the tests. Hence, in projective tests the individual's responses to the stimulus situation are not taken at their face value, because there are no ‘right’ or ‘wrong’ answers. Rather, emphasis is laid on his perception, or the meaning he attaches to the stimulus, and the way in which he endeavours to manipulate or organize it.

The purpose is never clearly indicated by the nature of the stimuli or the way they are presented, nor is any guidance provided for interpreting the responses. Since the individual is not asked to describe himself directly, and since he is provided with a stimulus in the form of a photograph, a picture or an ink-blot, etc., the responses to these stimuli are construed as indicators of the individual's own view of the world, his personality structure, his needs, tensions and anxieties, etc., says Bell.

  6. Case Study Method:

According to Biesanz and Biesanz, “the case study is a form of qualitative analysis involving the very careful and complete observation of a person, a situation or an institution.” In the words of Goode and Hatt, “Case study is a way of organizing social data so as to preserve the unitary character of the social object being studied.” P.V. Young defines the case study as “a method of exploring and analyzing the life of a social unit, be that a person, a family, an institution, a cultural group or even an entire community.”

In the words of Giddings, “the case under investigation may be one human individual only or only an episode in his life, or it might conceivably be a nation or an epoch of history.” Ruth Strang maintains that “the case history or study is a synthesis and interpretation of information about a person and his relationship to his environment collected by means of many techniques.”

Shaw and Clifford hold that “case study method emphasizes the total situation or combination of factors, the description of the process or consequences of events in which behaviour occurs, the study of individual behaviour in its total setting and the analysis and comparison of cases leading to the formulation of hypotheses.”

Test of Hypothesis

Hypothesis Testing Concept

Hypothesis testing is a statistical technique that is used in a variety of situations. Though the technical details differ from situation to situation, all hypothesis tests use the same core set of terms and concepts. The following descriptions of common terms and concepts refer to a hypothesis test in which the means of two populations are being compared.

NULL HYPOTHESIS

The null hypothesis is a clear statement about the relationship between two (or more) statistical objects. These objects may be measurements, distributions, or categories. Typically, the null hypothesis, as the name implies, states that there is no relationship.

In the case of two population means, the null hypothesis might state that the means of the two populations are equal.

ALTERNATIVE HYPOTHESIS

Once the null hypothesis has been stated, it is easy to construct the alternative hypothesis. It is essentially the statement that the null hypothesis is false. In our example, the alternative hypothesis would be that the means of the two populations are not equal.

SIGNIFICANCE

The significance level is a measure of the statistical strength of the hypothesis test. It is often characterized as the probability of incorrectly concluding that the null hypothesis is false.

The significance level is something that you should specify up front. In applications, the significance level is typically one of three values: 10%, 5%, or 1%. A 1% significance level represents the strongest test of the three, since it imposes the strictest standard of evidence.

POWER

Related to significance, the power of a test measures the probability of correctly rejecting the null hypothesis when it is false. Power is not something that you can choose. It is determined by several factors, including the significance level you select and the size of the difference between the things you are trying to compare.

Unfortunately, significance and power are inversely related: tightening the significance level decreases power. This makes it difficult to design experiments that have both very high significance and high power.

TEST STATISTIC

The test statistic is a single measure that captures the statistical nature of the relationship between observations you are dealing with. The test statistic depends fundamentally on the number of observations that are being evaluated. It differs from situation to situation.

DISTRIBUTION OF THE TEST STATISTIC

The whole notion of hypothesis testing rests on the ability to specify (exactly or approximately) the distribution that the test statistic follows. In the case of this example, the difference between the means will be approximately normally distributed (assuming there is a relatively large number of observations).

ONE-TAILED VS. TWO-TAILED TESTS

Depending on the situation, you may want (or need) to employ a one- or two-tailed test. These tails refer to the right and left tails of the distribution of the test statistic. A two-tailed test allows for the possibility that the test statistic is either very large or very small (negative is small). A one-tailed test allows for only one of these possibilities.

In an example where the null hypothesis states that the two population means are equal, you need to allow for the possibility that either one could be larger than the other. The test statistic could be either positive or negative. So, you employ a two-tailed test.

The alternative hypothesis might have been slightly different, namely that the mean of population 1 is larger than the mean of population 2. In that case, you don't need to account statistically for the situation where the first mean is smaller than the second. So, you would employ a one-tailed test.

CRITICAL VALUE

The critical value in a hypothesis test is based on two things: the distribution of the test statistic and the significance level. The critical value(s) refer to the point(s) in the test statistic's distribution that give the tails of the distribution an area (meaning probability) exactly equal to the chosen significance level.

DECISION

Your decision to reject or accept the null hypothesis is based on comparing the test statistic to the critical value. If the test statistic exceeds the critical value, you should reject the null hypothesis. In this case, you would say that the difference between the two population means is significant. Otherwise, you accept the null hypothesis.

P-VALUE

The p-value of a hypothesis test gives you another way to evaluate the null hypothesis. The p-value represents the lowest significance level at which your particular test statistic would justify rejecting the null hypothesis. For example, if you have chosen a significance level of 5% and the p-value turns out to be .03 (or 3%), you would be justified in rejecting the null hypothesis.
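To tie these concepts together, here is a minimal sketch of a two-sample comparison of means using scipy; the data are hypothetical, and the code computes the test statistic and p-value and applies the decision rule described above:

```python
# Minimal sketch: two-sample t-test of H0 "the population means are equal".
# The two samples below are hypothetical.
from scipy import stats

group1 = [23, 25, 28, 30, 27, 24, 26]
group2 = [31, 29, 34, 33, 30, 32, 35]

t_stat, p_value = stats.ttest_ind(group1, group2)  # two-tailed by default
alpha = 0.05  # chosen significance level

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis: the difference between means is significant.")
else:
    print("Fail to reject the null hypothesis.")
```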

Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson's son, Egon Pearson. Hypothesis testing is a statistical method that is used in making statistical decisions using experimental data. A hypothesis is basically an assumption that we make about a population parameter.

Hypothesis Testing is done to help determine if the variation between or among groups of data is due to true variation or if it is the result of sample variation. With the help of sample data we form assumptions about the population, then we have to test our assumptions statistically. This is called Hypothesis testing.

Key terms and concepts:

(i) Null hypothesis: The null hypothesis is a statistical hypothesis that assumes that an observed effect is due to chance. The null hypothesis is denoted by H0, for example H0: μ1 = μ2, which states that there is no difference between the two population means.

(ii) Alternative hypothesis: Contrary to the null hypothesis, the alternative hypothesis shows that observations are the result of a real effect.

(iii) Level of significance: Refers to the degree of significance at which we accept or reject the null hypothesis. Since 100% accuracy is not possible when accepting or rejecting a hypothesis, we select a level of significance, usually 5%.

(iv) Type I error: When we reject the null hypothesis, although that hypothesis was true.  Type I error is denoted by alpha.  In hypothesis testing, the normal curve that shows the critical region is called the alpha region.

(v) Type II errors: When we accept the null hypothesis but it is false.  Type II errors are denoted by beta.  In Hypothesis testing, the normal curve that shows the acceptance region is called the beta region.

(vi) Power: The probability of correctly rejecting the null hypothesis when it is false; 1 − β is called the power of the analysis.

(vii) One-tailed test: When the alternative hypothesis specifies a direction, such as H1: μ1 > μ2, the test is called a one-tailed test.

(viii) Two-tailed test: When the alternative hypothesis allows a difference in either direction, such as H1: μ1 ≠ μ2, the test is called a two-tailed test.

Importance of Hypothesis Testing

Hypothesis testing is one of the most important concepts in statistics because it is how you decide if something really happened, or if certain treatments have positive effects, or if groups differ from each other, or if one variable predicts another. In short, you want to determine whether your results are statistically significant and unlikely to have occurred by chance alone. In essence, then, a hypothesis test is a test of significance.

Possible Conclusions

Once the statistics are collected and you test your hypothesis against the likelihood of chance, you draw your final conclusion. If you reject the null hypothesis, you are claiming that your result is statistically significant and did not happen by luck or chance; the outcome supports the alternative hypothesis. If you fail to reject the null hypothesis, you must conclude that you did not find an effect or difference in your study. This method is how many pharmaceutical drugs and medical procedures are tested.

Steps in Hypothesis Testing

Step 1: State the Null Hypothesis

The null hypothesis can be thought of as the opposite of the “guess” the researcher made (in this example the biologist thinks the plant heights will differ among the fertilizers). So the null would be that there is no difference among the groups of plants. Specifically, in more statistical language, the null for an ANOVA is that the means are all the same: H0: μ1 = μ2 = μ3 = μ4.

Step 2: State the Alternative Hypothesis

The reason we state the alternative hypothesis this way is that if the Null is rejected, there are many possibilities.

For example, μ1 ≠ μ2 = μ3 = μ4 is one possibility, as is μ1 = μ2 ≠ μ3 = μ4. Many people make the mistake of stating the alternative hypothesis as μ1 ≠ μ2 ≠ μ3 ≠ μ4, which says that every mean differs from every other mean. This is a possibility, but only one of many. To cover all alternative outcomes, we resort to a verbal statement of ‘not all equal’ and then follow up with mean comparisons to find out where differences among means exist. In our example, this means that fertilizer 1 may result in plants that are really tall, but fertilizers 2, 3 and the plants with no fertilizer don't differ from one another. A simpler way of thinking about this is that at least one mean is different from all others.

Step 3: Set α

If we look at what can happen in a hypothesis test, we can construct the following contingency table:

                        In Reality
Decision        H0 is TRUE            H0 is FALSE
Accept H0       OK                    Type II error (β)
Reject H0       Type I error (α)      OK

(α = probability of a Type I error; β = probability of a Type II error)

You should be familiar with Type I and Type II errors from your introductory course. It is important to note that we want to set α before the experiment (a priori) because the Type I error is the more ‘grievous’ error to make. The typical value of α is 0.05, establishing a 95% confidence level. For this course we will assume α = 0.05.

Step 4: Collect Data

Remember the importance of recognizing whether data is collected through an experimental design or an observational study.

Step 5: Calculate a test statistic

For categorical treatment level means, we use an F statistic, named after R.A. Fisher. We will explore the mechanics of computing the F statistic beginning in Lesson 2. The F value we get from the data is labeled Fcalculated.

Step 6: Construct Acceptance / Rejection regions

As with all other test statistics, a threshold (critical) value of F is established. This F value can be obtained from statistical tables and is referred to as Fcritical or Fα. As a reminder, this critical value is the minimum value of the test statistic (in this case the F statistic) for us to be able to reject the null.

The critical value Fα divides the F distribution into the acceptance region (values below Fα) and the rejection region (values above Fα).

Step 7: Based on steps 5 and 6, draw a conclusion about H0

If the Fcalculated from the data is larger than the Fα, then you are in the Rejection region and you can reject the Null Hypothesis with (1-α) level of confidence.

Note that modern statistical software condenses steps 6 and 7 by providing a p-value. The p-value here is the probability of getting an Fcalculated even greater than what you observe. If, by chance, Fcalculated = Fα, then the p-value would exactly equal α. With larger Fcalculated values, we move further into the rejection region and the p-value becomes less than α. So the decision rule is as follows:

If the p-value obtained from the ANOVA is less than α, then Reject H0 and Accept HA.
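As an illustration of steps 5 to 7, here is a minimal sketch using scipy's one-way ANOVA with hypothetical plant heights for the fertilizer example:

```python
# Minimal sketch: one-way ANOVA for the fertilizer example.
# Plant heights per group are hypothetical.
from scipy import stats

control     = [20, 21, 19, 22]
fertilizer1 = [28, 30, 29, 31]
fertilizer2 = [21, 22, 20, 23]
fertilizer3 = [22, 21, 23, 20]

f_calculated, p_value = stats.f_oneway(control, fertilizer1, fertilizer2, fertilizer3)

alpha = 0.05
print(f"F = {f_calculated:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0 and accept HA: at least one mean differs from the others.")
else:
    print("Fail to reject H0.")
```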

Sampling errors

A sampling error is a statistical error that occurs when an analyst does not select a sample that represents the entire population of data, so the results found in the sample do not represent the results that would be obtained from the entire population. Sampling is an analysis performed by selecting a number of observations from a larger population, and the selection can produce both sampling errors and non-sampling errors.

Sampling error can be reduced by increasing the sample size and by ensuring that the sample adequately represents the entire population. Assume, for example, that XYZ Company provides a subscription-based service that allows consumers to pay a monthly fee to stream videos and other programming over the web. The firm wants to survey homeowners who watch at least 10 hours of programming over the web each week and pay for an existing video streaming service. XYZ wants to determine what percentage of the population is interested in a lower-priced subscription service. If XYZ does not think carefully about the sampling process, several types of sampling errors may occur.

Examples of Sampling Error

A population specification error means that XYZ does not understand the specific types of consumers who should be included in the sample. If, for example, XYZ creates a population of people between the ages of 15 and 25 years old, many of those consumers do not make the purchasing decision about a video streaming service because they do not work full-time. On the other hand, if XYZ put together a sample of working adults who make purchase decisions, the consumers in this group may not watch 10 hours of video programming each week.

Selection error also causes distortions in the results of a sample, and a common example is a survey that only relies on a small portion of people who immediately respond. If XYZ makes an effort to follow up with consumers who don’t initially respond, the results of the survey may change. Furthermore, if XYZ excludes consumers who don’t respond right away, the sample results may not reflect the preferences of the entire population.

Sample Size and Sampling Error

Given two studies that are otherwise exactly the same (same sampling methods, same population), the study with the larger sample size will have less sampling error than the study with the smaller sample size. Keep in mind that as the sample size increases, it approaches the size of the entire population and therefore approaches all the characteristics of the population, thus decreasing sampling error.
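A minimal simulation illustrating this point: the spread of the sample means (one way to see sampling error) shrinks as the sample size grows. The population here is synthetic:

```python
# Minimal sketch: sampling error decreases as sample size increases.
# We draw repeated samples from a synthetic population and measure the
# spread of the resulting sample means.
import random
import statistics

random.seed(0)
population = [random.gauss(50, 10) for _ in range(100_000)]

for n in (10, 100, 1000):
    sample_means = [statistics.mean(random.sample(population, n)) for _ in range(200)]
    spread = statistics.stdev(sample_means)
    print(f"n = {n:>4}: std. dev. of sample means = {spread:.2f}")
```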

Non-Sampling Errors

A non-sampling error is an error that results during data collection, causing the data to differ from the true values. Non-sampling error differs from sampling error. A sampling error is limited to any differences between sample values and universe values that arise because the entire universe was not sampled. Sampling error can result even when no mistakes of any kind are made. The “errors” result from the mere fact that data in a sample is unlikely to perfectly match data in the universe from which the sample is taken. This “error” can be minimized by increasing the sample size. Non-sampling errors cover all other discrepancies, including those that arise from a poor sampling technique.

Non-sampling errors may be present in both samples and censuses (in which an entire population is surveyed) and may be random or systematic. Random errors are believed to offset each other and are therefore of little concern. Systematic errors, on the other hand, affect the entire sample and therefore present a greater issue. Non-sampling errors can include, but are not limited to, data entry errors, biased survey questions, biased processing/decision making, non-responses, inappropriate analysis conclusions and false information provided by respondents.

While increasing sample size will help minimize sampling error, it will not have any effect on reducing non-sampling error. Unfortunately, non-sampling errors are often difficult to detect, and it is virtually impossible to eliminate them entirely.

Methods to Reduce Sampling Error

Of the two types of errors, sampling error is easier to identify. The main techniques for reducing sampling error are:

(i) Increase the sample size.

A larger sample size leads to a more precise result because the study gets closer to the actual population size.

(ii) Divide the population into groups.

Instead of a random sample, test groups according to their size in the population. For example, if people of a certain demographic make up 35% of the population, make sure 35% of the study is made up of this group.
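A minimal sketch of this proportional approach, with hypothetical population counts:

```python
# Minimal sketch: proportional (quota) sampling. If a group makes up 35%
# of the population, it fills 35% of the sample. All counts are hypothetical.
import random

random.seed(1)
strata = {"demographic_A": 3500, "demographic_B": 6500}  # population counts
total_pop = sum(strata.values())
sample_size = 200

for name, count in strata.items():
    quota = round(sample_size * count / total_pop)       # proportional quota
    members = [f"{name}_{i}" for i in range(count)]
    chosen = random.sample(members, quota)
    print(f"{name}: quota = {quota}, first picks = {chosen[:3]}")
```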

(iii) Know your population.

The error of population specification is when a research team selects an inappropriate population to obtain data. Know who buys your product, uses it, works with you, and so forth. With basic socio-economic information, it is possible to reach a consistent sample of the population. In cases like marketing research, studies often relate to one specific population like Facebook users, Baby Boomers, or even homeowners.

Methods to Reduce Non-Sampling Error

(i) Thoroughly Pretest your Survey Mediums

As discussed in the example above, it is very important to ensure that your survey and its invites run smoothly through any medium or on any device your potential respondents might use. People are much more likely to ignore survey requests if loading times are long, questions do not fit properly on their screens, or they have to work to make the survey compatible with their device. The best advice is to acknowledge your sample's different forms of communication software and devices and pre-test your surveys and invites on each, ensuring your survey runs smoothly for all your respondents.

(ii) Avoid Rushed or Short Data Collection Periods

One of the worst things a researcher can do is limit their data collection time in order to comply with a strict deadline. Your study's level of nonresponse bias will climb dramatically if you are not flexible with the time frames respondents have to answer your survey. Fortunately, flexibility is one of the main advantages of online surveys, since they do not require interviews (phone or in person) that must be completed at certain times of the day. However, keeping your survey live for only a few days can still severely limit a potential respondent's ability to answer. Instead, it is recommended to extend a survey collection period to at least two weeks so that participants can choose any day of the week to respond according to their own busy schedule.

(iii) Send Reminders to Potential Respondents

Sending a few reminder emails throughout your data collection period has been shown to effectively gather more completed responses. It is best to send your first reminder email midway through the collection period and the second near the end of the collection period. Make sure you do not harass the people on your email list who have already completed your survey! You can manage your reminders and invites on FluidSurveys through the trigger options found in the invite tool.

(iv) Ensure Confidentiality

Any survey that requires information that is personal in nature should include reassurance to respondents that the data collected will be kept completely confidential. This is especially the case in surveys that are focused on sensitive issues. Make certain someone reading your invite understands that the information they provide will be viewed as part of the whole sample and not individually scrutinized.

(v)  Use Incentives

Many people refuse to respond to surveys because they feel they do not have the time to spend answering questions. An incentive is usually necessary to motivate people into taking part in your study. Depending on the length of the survey, the difficulty in finding the correct respondents (e.g., one-legged, 15th-century spoon collectors), and the information being asked, the incentive can range from minimal to substantial in value. Remember, most respondents won’t have a vested interest in your study and must feel that the survey is worth their time!

Sampling and Sampling Distribution

Sample design is the framework, or road map, that serves as the basis for the selection of a survey sample and affects many other important aspects of a survey as well. In a broad context, survey researchers are interested in obtaining some type of information through a survey for some population, or universe, of interest. One must define a sampling frame that represents the population of interest, from which a sample is to be drawn. The sampling frame may be identical to the population, or it may be only part of it and therefore subject to some undercoverage, or it may have an indirect relationship to the population.

Sampling is the process of selecting a subset of individuals, items, or observations from a larger population to analyze and draw conclusions about the entire group. It is essential in statistics when studying the entire population is impractical, time-consuming, or costly. Sampling can be done using various methods, such as random, stratified, cluster, or systematic sampling. The main objectives of sampling are to ensure representativeness, reduce costs, and provide timely insights. Proper sampling techniques enhance the reliability and validity of statistical analysis and decision-making processes.

Steps in Sample Design

While developing a sampling design, the researcher must pay attention to the following points:

  • Type of Universe:

The first step in developing any sample design is to clearly define the set of objects, technically called the Universe, to be studied. The universe can be finite or infinite. In a finite universe the number of items is certain, but in the case of an infinite universe the number of items is infinite, i.e., we cannot have any idea about the total number of items. The population of a city, the number of workers in a factory and the like are examples of finite universes, whereas the number of stars in the sky, the listeners of a specific radio programme, throws of a die, etc. are examples of infinite universes.

  • Sampling unit:

A decision has to be taken concerning a sampling unit before selecting a sample. The sampling unit may be a geographical one such as a state, district or village; a construction unit such as a house or flat; a social unit such as a family, club or school; or an individual. The researcher will have to decide which one or more of such units to select for his study.

  • Source list:

Also known as the ‘sampling frame’, this is the list from which the sample is to be drawn. It contains the names of all items of a universe (in the case of a finite universe only). If the source list is not available, the researcher has to prepare it. Such a list should be comprehensive, correct, reliable and appropriate. It is extremely important for the source list to be as representative of the population as possible.

  • Size of Sample:

This refers to the number of items to be selected from the universe to constitute a sample. This is a major problem for the researcher. The size of the sample should neither be excessively large nor too small; it should be optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. While deciding the size of the sample, the researcher must determine the desired precision as well as an acceptable confidence level for the estimate. The size of the population variance needs to be considered, since a larger variance usually calls for a bigger sample. The size of the population must be kept in view, for this also limits the sample size. The parameters of interest in a research study must be kept in view while deciding the size of the sample. Costs, too, dictate the size of the sample that we can draw. As such, budgetary constraints must invariably be taken into consideration when we decide the sample size.

  • Parameters of interest:

In determining the sample design, one must consider the question of the specific population parameters which are of interest. For instance, we may be interested in estimating the proportion of persons with some characteristic in the population, or we may be interested in knowing some average or the other measure concerning the population. There may also be important sub-groups in the population about whom we would like to make estimates. All this has a strong impact upon the sample design we would accept.

  • Budgetary constraint:

Cost considerations, from a practical point of view, have a major impact upon decisions relating not only to the size of the sample but also to the type of sample. This fact can even lead to the use of a non-probability sample.

  • Sampling procedure:

Finally, the researcher must decide the type of sample he will use, i.e., he must decide about the technique to be used in selecting the items for the sample. In fact, this technique or procedure stands for the sample design itself. There are several sample designs (explained in the pages that follow), out of which the researcher must choose one for his study. Obviously, he must select that design which, for a given sample size and for a given cost, has a smaller sampling error.

Types of Samples

  • Probability Sampling (Representative samples)

Probability samples are selected in such a way as to be representative of the population. They provide the most valid or credible results because they reflect the characteristics of the population from which they are selected (e.g., residents of a particular community, students at an elementary school, etc.). There are two types of probability samples: random and stratified.

  • Random Sample

The term random has a very precise meaning: each individual in the population of interest has an equal likelihood of selection. This is a very strict meaning; you can’t just collect responses on the street and have a random sample.

The assumption of an equal chance of selection means that sources such as a telephone book or voter registration lists are not adequate for providing a random sample of a community. In both these cases there will be a number of residents whose names are not listed. Telephone surveys get around this problem by random-digit dialling but that assumes that everyone in the population has a telephone. The key to random selection is that there is no bias involved in the selection of the sample. Any variation between the sample characteristics and the population characteristics is only a matter of chance.

  • Stratified Sample

A stratified sample is a mini-reproduction of the population. Before sampling, the population is divided into characteristics of importance for the research. For example, by gender, social class, education level, religion, etc. Then the population is randomly sampled within each category or stratum. If 38% of the population is college-educated, then 38% of the sample is randomly selected from the college-educated population.

Stratified samples are as good as or better than random samples, but they require fairly detailed advance knowledge of the population characteristics, and therefore are more difficult to construct.
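As a sketch of the idea (pandas assumed; the column names and proportions are hypothetical), a stratified sample can be drawn by sampling the same fraction within every stratum, so each stratum keeps its population share:

```python
# A minimal sketch (hypothetical column names): stratified sampling with
# pandas, drawing the same fraction within every stratum so that strata
# keep their population proportions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
population = pd.DataFrame({
    "education": rng.choice(["college", "no_college"], size=10_000, p=[0.38, 0.62]),
    "income": rng.normal(40_000, 8_000, size=10_000),
})

# Sample 5% within each education stratum
sample = population.groupby("education").sample(frac=0.05, random_state=0)
print(sample["education"].value_counts(normalize=True))  # ~0.38 college-educated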

  • Non-probability Samples (Non-representative samples)

As they are not truly representative, non-probability samples are less desirable than probability samples. However, a researcher may not be able to obtain a random or stratified sample, or it may be too expensive. A researcher may not care about generalizing to a larger population. The validity of non-probability samples can be increased by trying to approximate random selection, and by eliminating as many sources of bias as possible.

  • Quota Sample

The defining characteristic of a quota sample is that the researcher deliberately sets the proportions of levels or strata within the sample. This is generally done to ensure the inclusion of a particular segment of the population. The proportions may or may not differ dramatically from the actual proportion in the population. The researcher sets a quota, independent of population characteristics.

Example: A researcher is interested in the attitudes of members of different religions towards the death penalty. In Iowa a random sample might miss Muslims (because there are not many in that state). To be sure of their inclusion, a researcher could set a quota of 3% Muslim for the sample. However, the sample will no longer be representative of the actual proportions in the population. This may limit generalizing to the state population. But the quota will guarantee that the views of Muslims are represented in the survey.

  • Purposive Sample

A purposive sample is a non-representative subset of some larger population, and is constructed to serve a very specific need or purpose. A researcher may have a specific group in mind, such as high-level business executives. It may not be possible to specify the population, as its members would not all be known, and access will be difficult. The researcher will attempt to zero in on the target group, interviewing whoever is available.

  • Convenience Sample

A convenience sample is a matter of taking what you can get. It is an accidental sample. Although selection may be unguided, it probably is not random, using the correct definition of everyone in the population having an equal chance of being selected. Volunteers would constitute a convenience sample.

Non-probability samples are limited with regard to generalization. Because they do not truly represent a population, we cannot make valid inferences about the larger group from which they are drawn. Validity can be increased by approximating random selection as much as possible, and making every attempt to avoid introducing bias into sample selection.

Sampling Distribution

Sampling Distribution is a statistical concept that describes the probability distribution of a given statistic (e.g., mean, variance, or proportion) derived from repeated random samples of a specific size taken from a population. It plays a crucial role in inferential statistics, providing the foundation for making predictions and drawing conclusions about a population based on sample data.

Concepts of Sampling Distribution

A sampling distribution is the distribution of a statistic (not raw data) over all possible samples of the same size from a population. Commonly used statistics include the sample mean (X̄), sample variance, and sample proportion.

Purpose:

It allows statisticians to estimate population parameters, test hypotheses, and calculate probabilities for statistical inference.

Shape and Characteristics:

    • The shape of the sampling distribution depends on the population distribution and the sample size.
    • For large sample sizes, the Central Limit Theorem states that the sampling distribution of the mean will be approximately normal, regardless of the population’s distribution.
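A minimal simulation of the Central Limit Theorem (numpy assumed; the population values are synthetic): even for a strongly skewed population, the distribution of sample means becomes nearly symmetric as the sample size grows.

```python
# A minimal simulation of the Central Limit Theorem: sample means from a
# strongly skewed (exponential) population look increasingly normal as n grows.
import numpy as np

rng = np.random.default_rng(7)
population = rng.exponential(scale=2.0, size=200_000)  # skewed, far from normal

for n in (2, 30, 500):
    means = rng.choice(population, size=(5_000, n)).mean(axis=1)
    # Skewness near 0 indicates an approximately symmetric, normal-like shape
    skew = ((means - means.mean()) ** 3).mean() / means.std() ** 3
    print(f"n={n:3d}  skewness of sample means = {skew:+.2f}")
```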

Importance of Sampling Distribution

  • Facilitates Statistical Inference:

Sampling distributions are used to construct confidence intervals and perform hypothesis tests, helping to infer population characteristics.

  • Standard Error:

The standard deviation of the sampling distribution, called the standard error, quantifies the variability of the sample statistic. Smaller standard errors indicate more reliable estimates.

  • Links Population and Samples:

It provides a theoretical framework that connects sample statistics to population parameters.

Types of Sampling Distributions

  • Distribution of Sample Means:

Shows the distribution of means from all possible samples of a population.

  • Distribution of Sample Proportions:

Represents the proportion of a certain outcome in samples, used in binomial settings.

  • Distribution of Sample Variances:

Explains the variability in sample data.

Example

Consider a population of students’ test scores with a mean of 70 and a standard deviation of 10. If we repeatedly draw random samples of size 30 and calculate the sample mean, the distribution of those means forms the sampling distribution. This distribution will have a mean close to 70 and a reduced standard deviation (standard error).
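A worked version of that example, assuming simple random sampling and using the usual standard-error formula:

$$\text{SE} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{30}} \approx 1.83$$

So the sample means cluster around 70 with a standard deviation (standard error) of about 1.83, far tighter than the population's standard deviation of 10.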

Data preparation & preliminary analysis

Data preparation is the process of cleaning and transforming raw data prior to processing and analysis. It is an important step prior to processing and often involves reformatting data, making corrections to data and the combining of data sets to enrich data.

Data preparation is often a lengthy undertaking for data professionals or business users, but it is essential as a prerequisite to put data in context in order to turn it into insights and eliminate bias resulting from poor data quality.

For example, the data preparation process usually includes standardizing data formats, enriching source data, and/or removing outliers.

Benefits of Data Preparation

76% of data scientists say that data preparation is the worst part of their job, but efficient, accurate business decisions can only be made with clean data. Data preparation helps:

  • Fix errors quickly: Data preparation helps catch errors before processing. After data has been removed from its original source, these errors become more difficult to understand and correct.
  • Produce top-quality data: Cleaning and reformatting datasets ensures that all data used in analysis will be high quality.
  • Make better business decisions: Higher quality data that can be processed and analyzed more quickly and efficiently leads to more timely, efficient and high-quality business decisions.

Additionally, as data and data processes move to the cloud, data preparation moves with it for even greater benefits, such as:

  • Superior scalability: Cloud data preparation can grow at the pace of the business. Enterprises don’t have to worry about the underlying infrastructure or try to anticipate its evolution.
  • Future proof: Cloud data preparation upgrades automatically so that new capabilities or problem fixes can be turned on as soon as they are released. This allows organizations to stay ahead of the innovation curve without delays and added costs.
  • Accelerated data usage and collaboration: Doing data prep in the cloud means it is always on, doesn’t require any technical installation, and lets teams collaborate on the work for faster results.

Additionally, a good, cloud-native data preparation tool will offer other benefits (like an intuitive and simple to use GUI) for easier and more efficient preparation.

Data Preparation Steps

The specifics of the data preparation process vary by industry, organization and need, but the framework remains largely the same.

1. Gather data

The data preparation process begins with finding the right data. This can come from an existing data catalog or can be added ad-hoc.

2. Discover and assess data

After collecting the data, it is important to discover each dataset. This step is about getting to know the data and understanding what has to be done before the data becomes useful in a particular context.

3. Cleanse and validate data

Cleaning up the data is traditionally the most time consuming part of the data preparation process, but it’s crucial for removing faulty data and filling in gaps. Important tasks here include:

  • Removing extraneous data and outliers.
  • Filling in missing values.
  • Conforming data to a standardized pattern.
  • Masking private or sensitive data entries.

Once data has been cleansed, it must be validated by testing for errors in the data preparation process up to this point. Oftentimes, an error in the system will become apparent during this step and will need to be resolved before moving forward.
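A minimal pandas sketch of these cleansing and validation tasks (the file and column names are hypothetical):

```python
# A minimal pandas sketch (hypothetical file and column names) of the
# cleansing and validation tasks listed in this step.
import pandas as pd

df = pd.read_csv("survey_raw.csv")  # assumed raw input

# Remove extraneous/out-of-range data: keep only valid 1-5 Likert responses
df = df[df["satisfaction"].between(1, 5)]

# Fill in missing values with the column median
df["age"] = df["age"].fillna(df["age"].median())

# Conform data to a standardized pattern (one date format)
df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")

# Mask private or sensitive entries before the data is shared
df["email"] = df["email"].str.replace(r"(^.).*(@.*$)", r"\1***\2", regex=True)

# Validate: key fields should contain no remaining missing values
assert df[["satisfaction", "age"]].notna().all().all()
```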

4. Transform and enrich data

Transforming data is the process of updating the format or value entries in order to reach a well-defined outcome, or to make the data more easily understood by a wider audience. Enriching data refers to adding and connecting data with other related information to provide deeper insights.

5. Store data

Once prepared, the data can be stored or channeled into a third-party application, such as a business intelligence tool, clearing the way for processing and analysis to take place.

Preliminary Steps in Quantitative Data Analysis

After collecting and before analyzing survey data, we recommend closely examining the data set to ensure the accuracy and representativeness of the information and the integrity of subsequent analyses. Data conditioning involves attending to detailed components of both an actual data set and the particular analytic techniques chosen to examine the data. This often requires more time and attention to detail than either the data collection or the subsequent analytic procedures. Though data conditioning can be a time-intensive step, carefully executing these practices allows one to responsibly proceed with accurately analyzing, interpreting, and reporting quantitative data. In addition, it offers a more fine-grained picture of the study abroad student sample, which can be quite informative even before more focused statistical analyses are begun.

Data Accuracy

The initial step in data conditioning attends to the issue of accurate data entry. This step requires an examination of how data have been entered (or uploaded) into a data file and a consideration of issues that could yield inaccurate analyses. Comparing the actual obtained data to the final data file is an essential step; however, the size of the sample under study affects the method by which this is typically executed. Tabachnick and Fidell (2013) outlined several components to consider in ensuring data accuracy; for example with small data sets, careful proofreading of all variable values is recommended, but for larger data sets, analyzing particular descriptive statistics and graphic representations of variables is typically more efficient in ensuring appropriate variable value ranges (e.g., possible minimum and maximum values). Analyzing descriptive statistics of variables differs depending on the types of variables examined (i.e., categorical or continuous variables). Categorical variables consist of data that are grouped into discrete categories: either nominal classifications devoid of any particular order or ordinal classifications that have a meaningful ranked order. For example, the location of a study abroad program (e.g., Asia, Europe, or South America) is a nominal variable, whereas asking participants to rate their responses to questions along a Likert-type rating scale (e.g., 1 = strongly disagree to 5 = strongly agree, or 1 = poor to 7 = excellent) is an example of an ordinal variable. Though Likert-type scale responses are technically categorical variables, these responses are often treated as continuous variables in data conditioning and later analyses. Continuous variables take on numeric values within a defined range and have equal intervals between data points (e.g., a student’s age or number of months immersed in a host country).

To check data accuracy for categorical variables, evaluators and researchers must examine the frequencies of responses in each possible category. For example, utilizing the frequency function in SPSS will display tables that include the number and percentage of responses in each of a variable’s categories, as well as the number of valid and missing values (after opening SPSS and loading your data file, follow these SPSS menu choices: Analyze > Descriptive Statistics > Frequencies). In addition, various types of charts can also be generated through the same SPSS navigation menu to graphically display frequencies, including bar charts, pie charts, and histograms. In looking at the frequency tables, several questions are helpful to ask. Are any values out of the range of the numbered categories (e.g., there are three categories of study abroad program types arbitrarily numbered 1 through 3 but the frequency chart or table indicates other number categories beyond these three values)? Finding nonexistent categories easily brings to light these types of data-entry errors. What do the frequencies suggest? How many responses are in each category? Which category contains the lowest and highest number of responses? What are the implications of low or high frequencies in particular categories?

To examine data accuracy for continuous variables (including Likert-type scales), we must analyze other descriptive statistics beyond frequencies. For instance, we often analyze the mean values (the averages) and dispersion (i.e., ranges and minimum-maximum values) of the continuous variables in SPSS (follow these SPSS menu choices: Analyze > Descriptive Statistics > Descriptives > Options) to answer important questions about the accuracy of the data. Do all of the values fall within the range of possible scores? If not, this points to data-entry errors. Do the mean values for the variables make sense based on what is already known about the population under study? The dispersion of a variable is also important to examine, particularly if there are any out-of-range values (i.e., below the minimum or beyond the maximum possible values). In addition, the standard deviation (the amount of variation from the mean) is also important to consider, as this indicates how closely values are dispersed around the sample’s mean. A low standard deviation value suggests that overall scores are generally clustered around the mean with little variation, making the likelihood of finding differences across the sample relatively small. Conversely, a high standard deviation value indicates that the sample’s scores are more widely dispersed across a wider range of scores, indicating a greater likelihood of differences in scores within a sample.
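For readers working outside SPSS, here is a minimal pandas equivalent of these frequency and descriptive checks (the file and column names are hypothetical):

```python
# A minimal pandas equivalent of the SPSS frequency and descriptive checks
# described above (file and column names are hypothetical).
import pandas as pd

df = pd.read_csv("study_abroad.csv")  # assumed data file

# Categorical accuracy check: frequencies expose nonexistent category codes
# (e.g., a 4 where program types are numbered only 1 through 3)
print(df["program_type"].value_counts(dropna=False))

# Continuous accuracy check: mean, standard deviation, minimum and maximum
# expose out-of-range values and implausible averages
print(df[["age", "months_abroad"]].describe())
```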

Finally, it is important to ensure that missing data are properly entered and coded in the data file. Data are missing from data files for several reasons, and these must be identified for accurate analyses and reporting. Participants, for instance, may choose not to answer particular questions on a survey, whereas others may have inadvertently skipped several questions or run out of time to complete the survey, leaving some answers blank. Finally, the nature of some survey questions may require participants to legitimately skip particular questions or blocks of questions. In SPSS, missing values are indicated by either an asterisk or the absence of any values. A more thorough discussion of missing data is found later.

Participant Response Rates

Once the data are checked for accuracy, response rates must be carefully examined to understand the representativeness of the sample. For several reasons, it is often not possible to survey, interview, or otherwise investigate every individual from a population of interest. Comparing the sample participants to the larger overall population of interest, examining how representative the sample is, and discussing any significant distinctions between the two is critical before findings can be understood and applied more broadly. Furthermore, external validity, which considers the generalizability of one’s findings (the extent to which one’s findings generalize beyond the current sample to the overall population under study), is an important aim of quantitative inquiry.

It is essential to know and report a participant response rate by determining the total number of individuals invited to participate in a study and those who actually participated. This is a simple proportion to calculate by dividing those who participated by the total invited, although it is important to take into account those who never received the initial invitation because of invalid e-mail addresses or returned mail. Beyond understanding response rates, it is necessary to consider how representative a sample is relative to the overall population of interest. How many and what types of individuals compose the overall population under study, and how does this compare to your final sample? Is the sample representative of important demographics of the total population, including race, ethnicity, gender, age, and other salient characteristics? Are there over- or underrepresented groups in your sample? What are the implications of these disparities? If these data are not readily accessible, campus institutional research or enrollment management areas can typically provide assistance in obtaining population data. Although beyond the scope of this chapter, weighting techniques can also be applied to correct for nonresponse biases (see NSSE, 2014).

Missing Data

The issue of missing data is one of the most prevalent quandaries in quantitative research and assessment efforts. In an extended discussion on the implications of and strategies for handling missing data, Tabachnick and Fidell (2013) stated that it is essential to first determine the severity of any missing data, particularly the patterns of missing data, the amount of data missing, and the reasons why the data may be missing. In quantitative research, missing data are often categorized as MCAR (missing completely at random), MAR (missing at random, which constitutes ignorable nonresponses), and MNAR (missing not at random, which constitutes nonignorable nonresponses) (Little, Jorgensen, Lang, & Moore, 2014). Randomly scattered missing values are less serious than nonrandom missing values, as the latter can affect the generalizability of results.

We can distinguish random from nonrandom missing data by testing for patterns in the missing data. Tabachnick and Fidell (2013) recommended two ways to test for this: First, one can construct a new variable that represents cases with missing and nonmissing values for an independent variable (e.g., a new variable could be created and coded as 0 = missing and 1 = not missing) and then test for mean differences on a continuous outcome measure between the groups using an independent-samples t-test (follow these SPSS menu choices: Analyze > Compare Means > Independent Samples t-Test). We can then examine the SPSS output and determine whether the two means differ significantly. The second strategy Tabachnick and Fidell (2013) outlined is SPSS’s missing value analysis (follow these SPSS menu choices: Analyze > Missing Value Analysis), which highlights the numbers and patterns of missing values by providing statistics including frequencies of missing values, t-tests, and missing patterns.
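A minimal sketch of the first strategy using scipy instead of SPSS (the file and variable names are hypothetical):

```python
# A minimal sketch of the first strategy above, using scipy instead of SPSS
# (file and variable names are hypothetical).
import pandas as pd
from scipy import stats

df = pd.read_csv("survey.csv")  # assumed data file

# Code cases 0 = missing, 1 = not missing on the variable of interest
missing_flag = df["income"].notna().astype(int)
outcome = df["engagement_score"]  # a continuous outcome measure

t, p = stats.ttest_ind(outcome[missing_flag == 1].dropna(),
                       outcome[missing_flag == 0].dropna())
print(f"t = {t:.2f}, p = {p:.3f}")  # a small p suggests nonrandom missingness
```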

Once the missing data patterns have been identified, there are a few different approaches and resulting implications in handling missing data that emphasize either excluding or substituting missing values. Excluding cases (participants) with missing data from analyses is a reasonable option if there is a random pattern of missing values, very few participants have missing data, and the participants are missing data on different variables and it appears that the missing cases represent a random subsample of the aggregate sample (Tabachnick & Fidell, 2013). By default, cases with missing values are usually excluded from most analyses in SPSS based on a listwise deletion technique. Although this is an acceptable approach, provided that the previous points are considered, excluding cases with extensive missing values (over 10% in most cases) can compromise the external validity of the results.

Tabachnick and Fidell (2013) recommended a number of different imputation or substitution approaches to use if a variable is missing extensive data yet is important to the analysis: First, one can use prior knowledge to replace missing values with an informed estimate if the sample is large and the number of missing values is small. For instance, if, given experience or expertise in a field, one is sure that the missing values would equate to the median, mean, or most frequent response, it is acceptable to substitute those values and note the reasons for doing so. Second, one can transform an ordinal or continuous variable into a dichotomous variable (e.g., participated or did not participate in study abroad; low or high engagement) and predict into which category to place the missing case. For longitudinal data, one can use the last observed value to fill in missing data, but this implies that there was no change over time. Third, one can substitute missing values by inserting an overall sample mean or a subsample mean defined by a particular grouping variable. Finally, one can utilize a regression-based technique on those cases with complete data to generate an equation that substitutes estimated missing values for incomplete cases. In the long run, effective methods of reducing missing data may focus on well-constructed surveys in which students are less likely to leave data blank and exhortations for students to leave no answers blank as they work through the questions.
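Two of the substitution approaches above, sketched in pandas (the file and column names are hypothetical):

```python
# A minimal sketch (hypothetical column names) of two substitution approaches
# named above: overall-mean substitution and subsample-mean substitution.
import pandas as pd

df = pd.read_csv("survey.csv")  # assumed data file

# Substitute the overall sample mean for missing values
df["gpa_mean_imputed"] = df["gpa"].fillna(df["gpa"].mean())

# Substitute a subsample mean defined by a grouping variable
df["gpa_group_imputed"] = df["gpa"].fillna(
    df.groupby("program_type")["gpa"].transform("mean")
)
```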

For those interested in a much more in-depth discussion of missing data analysis, see Enders (2010) for quite thorough overviews and methods of different techniques to handle various types of missing data.

Detecting Outliers (Extreme Values)

Occasionally, outliers (extreme, unexpected values) surface in the data and must be addressed, especially with small sample sizes. Participants can respond randomly to questions or represent genuinely rare cases, so it is often helpful to examine the other items attached to a particular participant to see a fuller picture and possibly explain any extreme values. Univariate outliers (an extreme value on one variable) and multivariate outliers (an unusual combination of scores on two or more variables) distort sample statistics (i.e., they can lead to either stating there is a relationship or effect when there is not one or failing to detect a relationship or effect when there is one) and interfere with generalizability.

Tabachnick and Fidell (2013) discussed several reasons for outliers: First, incorrect data entry can produce incorrect values, some of which may be outliers (e.g., accidentally typing a value of 22 instead of 2). Second, failure to specify missing-value codes for data that should be read as real data can also produce outliers. Third, an outlier could be from outside of the population from which we wish to sample; we should delete these cases once they are detected, as they are not relevant to our analyses. Finally, an outlier could be from the population of interest, but the distribution of the variable has more extreme values than expected in a normal distribution. In this final case, we can retain these outliers but change the value on the variable so that the outlier’s impact on the analyses is attenuated. Given the more advanced nature of identifying and handling multivariate outliers, we recommend referring to Tabachnick and Fidell (2013) for a more extended discussion.
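A minimal univariate-outlier sketch in pandas (the column name is hypothetical, and the 3-standard-deviation cutoff and percentile limits are illustrative choices, not rules from the text):

```python
# A minimal univariate-outlier sketch (hypothetical column name). The 3-SD
# cutoff and the percentile limits are illustrative choices, not fixed rules.
import pandas as pd

df = pd.read_csv("survey.csv")  # assumed data file

# Flag values more than 3 standard deviations from the mean
z = (df["score"] - df["score"].mean()) / df["score"].std()
print(df[z.abs() > 3])

# One remedy described above: retain outliers but pull their values in
# toward the distribution (here, capping at the 1st and 99th percentiles)
lo, hi = df["score"].quantile([0.01, 0.99])
df["score_adj"] = df["score"].clip(lower=lo, upper=hi)
```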

Looking for Correlations Among Variables

Data conditioning also involves examining the degree to which continuous variables (including Likert-type scales) are correlated – or related – to one another. Note that correlations are not viable using categorical data, as the numerical values of those variables are not meaningful (the numerical values solely serve to categorize data into discrete groups). When examining correlations between continuous variables, correlation coefficients in SPSS will indicate the direction and strength of the correlation between the variables. Correlation coefficients are reported as values between -1.0 and +1.0. (Note: A positive relationship indicates that as one variable either increases or decreases, the other variable increases or decreases in the same manner; a negative relationship indicates that as one variable either increases or decreases, the other variable moves in the opposite direction.) To examine the correlations among all of the continuous variables in a data set, we can produce a correlation matrix in SPSS (follow these SPSS menu choices: Analyze > Correlate > Bivariate), which is simply a table that allows one to see the correlation coefficients for the specified variables to determine the direction (positive or negative) and degree to which they are related to each other. The closer correlation coefficients are to a value of -1.0 or +1.0, the stronger the negative or positive relationships, whereas the closer these values are to zero, the weaker the relationships.
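A minimal pandas analogue of that correlation matrix (the file and column names are hypothetical):

```python
# A minimal pandas analogue of the SPSS correlation matrix described above
# (file and column names are hypothetical).
import pandas as pd

df = pd.read_csv("survey.csv")  # assumed data file

continuous = df[["multicultural_courses", "global_awareness", "age"]]
print(continuous.corr())  # coefficients range from -1.0 to +1.0
```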

For example, using responses from two survey items found on the GPI, we are interested in understanding the relationship between the number of multicultural courses taken at college and the degree to which students felt informed of current issues that impact international relations. Intuitively, it might seem that there could be a relationship between these two items, but whether this is statistically significant – and if so, the strength of this relationship – will be useful to understand. Using the SPSS navigation described earlier, we ran a bivariate (two-variable) correlation on these two items and found a correlation coefficient of 0.058 that was statistically significant. This value indicates that there is a statistically significant and positive (the correlation coefficient was greater than zero) relationship between these variables; in other words, as students complete more multicultural courses, their understanding of current global issues also increases. This correlation coefficient also illustrates, though, that although statistically significant, it is a weak relationship, as the value is very close to zero at 0.058. In this case, our intuition was correct in that these GPI items are, indeed, related, but the weak relationship between them is not that meaningful.

Of particular concern in the data conditioning stage for multivariate analyses is when two or more variables are strongly correlated with each other. For instance, problems can occur when independent variables are highly correlated with each other in the same multivariate model, which may lead to unstable findings, larger standard errors, and a reduced likelihood of statistical significance (see Grimm & Yarnold, 1995, for an expanded discussion of multicollinearity issues). As such, it is important to examine a correlation matrix prior to engaging in multivariate analyses.

Multivariate Analysis of Data

  1. Univariate Data:

This type of data consists of only one variable. The analysis of univariate data is thus the simplest form of analysis, since the information deals with only one quantity that changes. It does not deal with causes or relationships, and the main purpose of the analysis is to describe the data and find patterns that exist within it. An example of univariate data is height.

Heights (in cm): 164, 167.3, 170, 174.2, 178, 180, 186

Suppose that the heights of seven students in a class are recorded (as above); there is only one variable, height, and it does not deal with any cause or relationship. The description of patterns found in this type of data can be made by drawing conclusions using measures of central tendency (mean, median and mode), dispersion or spread of data (range, minimum, maximum, quartiles, variance and standard deviation), and by using frequency distribution tables, histograms, pie charts, frequency polygons and bar charts.

  2. Bivariate Data:

This type of data involves two different variables. The analysis of this type of data deals with causes and relationships, and is done to find out the relationship between the two variables. An example of bivariate data is temperature and ice cream sales in the summer season.

Temperature (in Celsius)    Ice Cream Sales
20                          2000
25                          2500
35                          5000
43                          7800

Suppose the temperature and ice cream sales are the two variables of a bivariate data set (see the table above). Here, the relationship is visible from the table: temperature and sales rise together and are thus related, because as the temperature increases, the sales also increase. Thus bivariate data analysis involves comparisons, relationships, causes and explanations. These variables are often plotted on the X and Y axes of a graph for a better understanding of the data, and one of these variables is independent while the other is dependent.
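The strength of that relationship can be quantified with a correlation coefficient; here is a minimal numpy sketch using the table’s values:

```python
# A minimal numpy sketch quantifying the relationship in the table above.
import numpy as np

temperature = np.array([20, 25, 35, 43])
sales = np.array([2000, 2500, 5000, 7800])

r = np.corrcoef(temperature, sales)[0, 1]
print(f"correlation coefficient r = {r:.3f}")  # ~0.99: a strong positive link
```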

  3. Multivariate Data:

When the data involves three or more variables, it is categorized as multivariate. For example, suppose an advertiser wants to compare the popularity of four advertisements on a website; their click rates could be measured for both men and women, and relationships between variables could then be examined.

It is similar to bivariate analysis but contains more than one dependent variable. The way to perform analysis on this data depends on the goals to be achieved. Some of the techniques are regression analysis, path analysis, factor analysis and multivariate analysis of variance (MANOVA).

Additional Statistical Methods

1. Mean

The arithmetic mean, more commonly known as “the average,” is the sum of a list of numbers divided by the number of items on the list. The mean is useful in determining the overall trend of a data set or providing a rapid snapshot of your data. Another advantage of the mean is that it’s very easy and quick to calculate.

Pitfall:

Taken alone, the mean is a dangerous tool. In some data sets, the mean is also closely related to the mode and the median (two other measures of central tendency). However, in a data set with a high number of outliers or a skewed distribution, the mean simply doesn’t provide the accuracy you need for a nuanced decision.
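A tiny illustration of the pitfall (the values are hypothetical): one outlier drags the mean far from the typical value, while the median barely moves.

```python
# A tiny illustration (hypothetical salary values, in thousands): one outlier
# drags the mean far from the typical value, while the median barely moves.
from statistics import mean, median

salaries = [30, 32, 35, 38, 40, 400]
print(mean(salaries))    # ~95.8 -- distorted by the single outlier
print(median(salaries))  # 36.5  -- a more representative "typical" value here
```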

2. Standard Deviation

The standard deviation, often represented with the Greek letter sigma, is a measure of the spread of data around the mean. A high standard deviation signifies that data is spread more widely from the mean, whereas a low standard deviation signals that more data align with the mean. In a portfolio of data analysis methods, the standard deviation is useful for quickly determining the dispersion of data points.

Pitfall:

Just like the mean, the standard deviation is deceptive if taken alone. For example, if the data have a very strange pattern, such as a non-normal curve or a large number of outliers, then the standard deviation won’t give you all the information you need.

3. Regression

Regression models the relationships between dependent and explanatory variables, which are usually charted on a scatterplot. The regression line also designates whether those relationships are strong or weak. Regression is commonly taught in high school or college statistics courses with applications for science or business in determining trends over time.

Pitfall:

Regression is not very nuanced. Sometimes, the outliers on a scatterplot (and the reasons for them) matter significantly. For example, an outlying data point may represent the input from your most critical supplier or your highest-selling product. The nature of a regression line, however, tempts you to ignore these outliers. As an illustration, examine Anscombe’s quartet, in which the data sets have the exact same regression line but include widely different data points.
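A minimal sketch of fitting a regression line (numpy assumed; the data are synthetic). The fitted slope and intercept summarize the trend, but as Anscombe’s quartet warns, very different point patterns can share the same line, so always inspect the scatterplot as well:

```python
# A minimal sketch (synthetic data): fitting a least-squares regression line
# with numpy. The slope/intercept summarize the trend, but very different
# point patterns can share the same line, so always inspect the scatterplot.
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(20, dtype=float)
y = 3.0 * x + 5.0 + rng.normal(0, 4, size=x.size)  # true slope 3, intercept 5

slope, intercept = np.polyfit(x, y, deg=1)
print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
```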

4. Sample Size Determination

When measuring a large data set or population, like a workforce, you don’t always need to collect information from every member of that population – a sample does the job just as well. The trick is to determine the right size for the sample to be accurate. Using proportion and standard deviation methods, you can determine the sample size you need for your data collection to be statistically meaningful.
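One commonly used formula for this calculation (estimating a proportion at a 95% confidence level, with a conservative p = 0.5 and a 5% margin of error):

$$n = \frac{z^2\,p(1-p)}{e^2} = \frac{1.96^2 \times 0.5 \times 0.5}{0.05^2} \approx 385$$

That is, roughly 385 respondents suffice for a ±5% margin of error under these assumptions, regardless of how large the overall population is.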

Pitfall:

When studying a new, untested variable in a population, your proportion equations might need to rely on certain assumptions. However, these assumptions might be completely inaccurate. This error is then passed along to your sample size determination and on to the rest of your statistical data analysis.

5. Hypothesis Testing

Also commonly called t-testing, hypothesis testing assesses whether a certain premise is actually true for your data set or population. In data analysis and statistics, you consider the result of a hypothesis test statistically significant if the results couldn’t have happened by random chance. Hypothesis tests are used in everything from science and research to business and economics.
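A minimal one-sample t-test sketch (scipy assumed; the scores are synthetic): does a sample plausibly come from a population whose mean is 70?

```python
# A minimal one-sample t-test sketch (synthetic data): does this sample
# plausibly come from a population whose mean is 70?
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
scores = rng.normal(loc=73, scale=10, size=30)  # hypothetical sample

t, p = stats.ttest_1samp(scores, popmean=70)
print(f"t = {t:.2f}, p = {p:.3f}")  # a small p means chance alone is unlikely
```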

Pitfall:

To be rigorous, hypothesis tests need to watch out for common errors. For example, the placebo effect occurs when participants falsely expect a certain result and then perceive (or actually attain) that result. Another common error is the Hawthorne effect (or observer effect), which happens when participants skew results because they know they are being studied.

Overall, these methods of data analysis add a lot of insight to your decision-making portfolio, particularly if you’ve never analyzed a process or data set with statistics before. However, avoiding the common pitfalls associated with each method is just as important. Once you master these fundamental techniques for statistical data analysis, then you’re ready to advance to more powerful data analysis tools.
