**Statistics - Introduction**

The word Statistics seems to have been derived from the Latin word “status” or the Italian word “statista”. Both words mean a political state. In early years, “statistics” meant a collection of facts about the people in the state, gathered for administrative or political purposes.

**Webster** defined statistics as “the classified facts representing the conditions of the people in a state, especially those facts which can be stated in numbers or in tables of numbers or in any tabular or classified arrangement.”

A comprehensive definition was given by **Prof. Horace Secrist**, which is as follows: *“By Statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other.”*

*The above definitions clearly point out certain characteristics which numerical data must possess in order that they may be called statistics. These are as follows:*

*(i)* __Statistics are aggregates of facts__: **Single and isolated figures are not statistics because they cannot be compared and no meaningful conclusion can be drawn from them. Only aggregates of facts capable of offering some meaningful conclusion constitute statistics.** *(All statistics are expressed in numbers, but not all numbers are statistics.)*

*(ii)* __Statistics must be numerically expressed__: **Statistical methods are applicable only to those data which can be numerically expressed. Qualitative expressions like honesty, intelligence and sincerity are not statistics unless they can be numerically expressed.**

*(iii)* __Statistics should be capable of being related to each other__: **Statistical data should be capable of comparison and of being connected to each other. If there is no apparent relationship between the data, they cannot be called statistics.**

*(iv)* __Statistics should be collected in a systematic manner__: **For collecting statistical data a suitable plan should be prepared and the work should be done accordingly.**

*(v)* __Statistics should be collected for a definite purpose__: **The purpose of collecting data must be decided in advance. The purpose should be specific and well defined.**

*(vi)* __Statistics are affected to a marked extent by a large number of causes__: **Facts and figures are affected to a marked extent by the combined influence of a number of forces.**

*(vii)* __Reasonable standard of accuracy should be maintained in collection of statistics__: **Statistics deals with large volumes of data. Instead of counting each and every item, statisticians take a sample and apply the result obtained from the sample to the whole group. The degree of accuracy of a sample largely depends upon the nature and object of the enquiry. If a reasonable standard of accuracy is not maintained, the numbers may give misleading results.**

**Various stages in statistical investigation:**

**There are five stages in a statistical investigation which are given below:**

*(i)* __Collection of Data__: **Utmost care must be exercised in collecting data as they are the foundation of statistical analysis. If the data are faulty, the conclusions drawn can never be reliable.**

*(ii)* __Organisation of Data__: **Data collected from published sources are generally in organised form, but data collected from a survey frequently need organisation. For meaningful analysis, it is necessary to properly organise the collected data. Organising data involves three steps, which are:**

(a) Editing of data

(b) Classification of data according to some common characteristics

(c) Tabulation.

*(iii)* __Presentation of Data__: **Organised data can be further presented in the form of** *Diagrams* **and** *Graphs*.

*(iv)* __Analysis__: **After collection, organisation and presentation, data are analysed by adopting various statistical methods such as measures of central tendency, measures of variation, correlation, regression etc. to dig out information useful for decision-making.**

*(v)* __Interpretation__: **The last stage is interpretation, which is a difficult task and requires a high degree of skill, care and experience. If the data have been analysed but not properly interpreted, the whole object of the investigation may be defeated and wrong conclusions may be drawn.**

**Functions and Limitations of Statistics:**

*The functions of statistics are as follows:*

*(i) It presents fact in a definite form.* **Numerical expressions are convincing and, therefore, one of the most important functions of statistics is to present statements in a precise and definite form.**

*(ii) It simplifies mass of figures.* **Data presented in the form of a table, graph or diagram, or as averages or coefficients, are simple to understand.**

*(iii) It facilitates comparison.* **Once the data are simplified they can be compared with other similar data. Without such comparison the figures would be useless.**

*(iv) It helps in prediction.* **Plans and policies of organisations are invariably formulated well in advance of their implementation. Knowledge of future trends is very useful in framing suitable policies and plans.**

*(v) It helps in formulating and testing hypothesis.* **Statistical methods like the z-test, t-test and χ²-test are extremely helpful in formulating and testing hypotheses and in developing new theories.**

*(vi) It helps in the formulation of suitable policies.* **Statistics provide the basic material for framing suitable policies. It helps in estimating export, import or production programmes in the light of changes that may occur.**

*(vii) Statistics indicates trend behavior.* **Statistical techniques such as correlation, regression, time series analysis etc. are useful in forecasting future events.**

*Limitations of statistics are as follows:*

*(i) Statistics deals only with quantitative characteristics.* **Statistics are numerical statements of facts. Data which cannot be expressed in numbers are incapable of statistical analysis. Qualitative characteristics like honesty, efficiency, intelligence etc. cannot be studied directly.**

*(ii) Statistics deals with aggregates not with individuals.* **Since statistics deals with aggregates of facts, the study of individual measurements lies outside the scope of statistics.**

*(iii) Statistical laws are not perfectly accurate.* **Statistics deals with characteristics which are affected by a multiplicity of causes, and it is not possible to study the effect of each of these factors. Due to this limitation, the results obtained are not perfectly accurate but only an approximation.**

*(iv) Statistical results are only an average.* **Statistical results reveal only the average behavior. The conclusions obtained statistically are not universally true; they are true only under certain conditions.**

*(v) Statistics is only one of the methods of studying a problem.* **Statistical tools do not provide the best solution under all circumstances.**

*(vi) Statistics can be misused.* **The greatest limitation of statistics is that they are liable to be misused. Data placed in the hands of an inexperienced person may lead to wrong results. Only persons having fundamental knowledge of statistical methods can handle the data properly.**

**Types of statistical data:**

**Statistical data are of two types:**

(a) Primary data

(b) Secondary data.

**Primary Data:** Data which are collected for the first time for a specific purpose are known as primary data.

*For example: Population census, national income data collected by the government, Textile Bulletin (Monthly), Reserve Bank of India Bulletin (Monthly) etc.*

**Secondary Data:** Data which have been collected by someone else and are used in an investigation are known as secondary data. Data are primary to the collector, but secondary to the user.

*For example: Statistical Abstract of the Indian Union, Monthly Abstract of Statistics, Monthly Statistical Digest, International Labour Bulletin (Monthly).*

__Merits and Demerits of Primary Data__**:**

__Merits:__

(a) They are reliable and accurate.

(b) If, during collection, the data are found to be wrong, they can be checked again by cross-examination.

(c) They are more suitable if the field of enquiry is small.

__Demerits:__

(a) If the field of enquiry is too wide, primary data are not suitable.

(b) Collection of primary data is costly and time consuming.

(c) Personal bias, prejudice and whims may affect the data.

__Merits and Demerits of Secondary Data__**:**

__Merits:__

(a) While using secondary data, time and labour are saved.

(b) They may also be collected from unpublished sources.

(c) If secondary data are available, they are much quicker to obtain than primary data.

__Demerits:__

(a) The degree of accuracy may not be acceptable.

(b) Secondary data may or may not fit the needs of the project.

(c) The data may be influenced by the personal bias of the investigator.

__Difference between Primary Data and Secondary Data__**:**

(a) Primary data are those which are collected for the first time and are thus original in character, while secondary data are those which have already been collected by someone else.

(b) Primary data are in the form of raw material, whereas secondary data are in the form of finished products.

(c) Primary data are collected directly from the people related to the enquiry, while secondary data are collected from published materials.

(d) Data are primary in the hands of the institution collecting them, while they are secondary for all others.

**Sources of Secondary Data**

The main sources of secondary data are:

(a) Official publications of the central and state governments and district boards.

(b) Publications of research institutions, universities etc.

(c) Economic journals.

(d) Commercial journals.

(e) Reports of committees and commissions.

(f) Publications of trade associations, chambers of commerce etc.

__Precautions in the use of Secondary Data:__

*The following aspects should be considered before use of secondary data:*

**(i) Suitability:** Before using secondary data, the investigator must check whether they are suitable for the present purpose.

**(ii) Adequacy:** After being satisfied about the suitability of the data, the investigator has to determine whether they are adequate for the purpose of the present investigation.

**(iii) Dependability:** Dependability of secondary data is determined by the following factors:

(a) The authority which collected the data.

(b) The procedure of sampling followed.

(c) The status of the investigator.

**(iv) Units in which data are available.**

__Qualities of Secondary Data:__

(a) Data should be reliable.

(b) Data should be suitable for the purpose of the investigator.

(c) Data should be adequate.

(d) Data should be collected by a trained investigator.

**Methods of collecting primary Data**

(a) *Direct Personal Observation:* Under this method, the investigator collects the data personally from the persons concerned. The information obtained under this method is original in nature. This method is suitable when the field of enquiry is small.

(b) *Indirect Oral Investigation:* Under this method, the investigator collects the data from third parties capable of supplying the necessary information. This method is suitable where the information to be obtained is of a complex nature and informants cannot be approached directly.

(c) *Schedule and questionnaire:* A list of questions regarding the enquiry is prepared and printed. Data are collected in either of the following ways:

(i) By sending the questionnaire to the persons concerned with a request to answer the questions and return the questionnaire.

(ii) By sending the questionnaire through enumerators who help the informants.

(d) *Local reports:* This method gives only approximate results at a low cost.

**Questionnaire**

**A questionnaire is simply a list of questions on a printed sheet relating to the survey, which the investigator asks the informants; the answers of the informants are noted down against the respective questions on the sheet. The choice of questions is a very important part of the enquiry, whatever its nature.**

__Characteristics of an ideal Questionnaire:__

(i) The schedule of questions must not be lengthy.

(ii) It should be clear and simple.

(iii) Questions should be arranged in a logical sequence.

(iv) Each question should be brief and must aim at some particular information necessary for the investigation.

(v) Questions on personal matters like income or property should be avoided.

(vi) The units of information should be clearly shown on the sheet.

**Tabulation**

Tabulation refers to the systematic arrangement of information in rows and columns. Rows are the horizontal arrangement and columns are the vertical arrangement. In simple words, tabulation is a layout of figures in rectangular form with appropriate headings to explain the different rows and columns. The main purpose of a table is to simplify the presentation and to facilitate comparisons.

According to Neiswanger, "A statistical table is a
systematic organisation of data in columns and rows."
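As a minimal sketch of tabulation, the snippet below arranges raw survey records into rows and columns with headings and totals. The records, the `tabulate` and `format_table` helper names, and the department and gender categories are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical survey records: one (department, gender) pair per respondent.
records = [
    ("Sales", "Male"), ("Sales", "Female"), ("Sales", "Female"),
    ("Accounts", "Male"), ("Accounts", "Male"),
    ("Production", "Female"), ("Production", "Male"), ("Production", "Male"),
]

def tabulate(records):
    """Return row labels, column labels and cell counts for a two-way table."""
    counts = Counter(records)
    rows = sorted({dept for dept, _ in records})
    cols = sorted({gender for _, gender in records})
    return rows, cols, counts

def format_table(rows, cols, counts):
    """Lay the counts out in rows and columns with headings and totals."""
    header = f"{'Department':<12}" + "".join(f"{c:>8}" for c in cols) + f"{'Total':>8}"
    lines = [header]
    for r in rows:
        cells = [counts[(r, c)] for c in cols]
        lines.append(f"{r:<12}" + "".join(f"{n:>8}" for n in cells) + f"{sum(cells):>8}")
    col_totals = [sum(counts[(r, c)] for r in rows) for c in cols]
    lines.append(
        f"{'Total':<12}" + "".join(f"{n:>8}" for n in col_totals) + f"{sum(col_totals):>8}"
    )
    return "\n".join(lines)

rows, cols, counts = tabulate(records)
table = format_table(rows, cols, counts)
print(table)
```

Eight scattered records become a three-row, two-column table, which makes the comparison between departments immediate.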

**The principal objectives of tabulation are stated below:**

**(i) To make complex data simple:** When data are arranged systematically in a table, such data become more meaningful and can be easily understood.

**(ii) To facilitate comparison:** When different data sets are presented in tables it becomes possible to compare them.

**(iii) To economize space:** A statistical table furnishes maximum information relating to the study in minimum space.

**(iv) To make data fit for analysis and interpretation:** Tabulation serves as a link between the collection of data on the one hand and analysis of such data on the other. In other words, after tabulating the data, it becomes possible to find out their averages, dispersion and correlation. Such statistical measures are necessary for their interpretation.

**(v) To provide reference:** A statistical table can be used as a source of reference for other studies of similar nature.

**Importance of Tabulation:**

a) Tabulation makes the data brief. Therefore, the data can be easily presented in the form of graphs.

b) Tabulation presents the numerical figures in an attractive form.

c) Tabulation makes complex data simple and, as a result, the data become easy to understand.

d) This form of presentation of data is helpful in finding mistakes.

e) Tabulation is useful in condensing the collected data.

f) Tabulation makes it easy to analyze the data from tables.

g) Tabulation is a very cheap way to present the data. It saves time as well as space.

h) Tabulation is a device to summarise large, scattered data, so the maximum information may be obtained from these tables.

**Limitations of Tabulation**

Tabulation suffers from the following limitations:

a) Tables contain only numerical data. They do not contain details.

b) Qualitative expression is not possible through tables.

c) Tables can be used to draw conclusions only by experts. Laymen may not understand them properly.

**Classification of Data**

The process of arranging the data in groups or classes according to their common characteristics is technically known as classification. Classification is the grouping of related facts into classes. It is the first step in tabulation.

In the words of Secrist, "Classification is the process of arranging data into sequences and groups according to their common characteristics or separating them into different but related parts."

**Essentials of classification**

a) The classification must be exhaustive so that every unit of the distribution may find a place in one group or another.

b) Classification must conform to the objects of the investigation.

c) All the items constituting a group must be homogeneous.

d) Classification should be elastic so that new facts and figures may easily be accommodated.

e) Classification should be stable. If it is not so and is changed for every enquiry, then the data would not be fit for an enquiry.

f) The classes must not overlap. Each item of the data must be found in one class only.
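Two of these essentials, exhaustive classes and non-overlapping classes, can be illustrated with a small frequency-distribution sketch. The `classify` function, the class width of 10 and the marks are assumptions made for this example; half-open intervals are just one common way to keep classes from overlapping.

```python
def classify(values, lower, width, n_classes):
    """Count how many values fall in each class interval [start, start + width)."""
    upper = lower + width * n_classes
    frequencies = [0] * n_classes
    for v in values:
        if not lower <= v < upper:
            # Exhaustiveness check: every item must find a place in some class.
            raise ValueError(f"{v} falls outside the classes {lower}-{upper}")
        # Half-open intervals send a boundary value such as 20 to the class
        # 20-30 only, so no item can be counted in two classes.
        frequencies[int((v - lower) // width)] += 1
    return [
        (lower + i * width, lower + (i + 1) * width, freq)
        for i, freq in enumerate(frequencies)
    ]

# Hypothetical marks of 12 students (illustrative data only).
marks = [12, 25, 37, 20, 41, 18, 33, 29, 45, 10, 30, 22]

table = classify(marks, lower=10, width=10, n_classes=4)
for start, end, freq in table:
    print(f"{start}-{end}: {freq}")
```

Because every value is counted exactly once, the class frequencies add up to the number of items, which is a quick check that the classification is both exhaustive and free of overlap.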

**Population and Sample**

__Population:__ **Statistics is taken in relation to a large body of data. Single and unconnected data are not statistics. In the field of a statistical enquiry there may be persons, items or any other similar units. The aggregate of all such units under consideration is called the “Universe” or “Population”.**

__Sample:__ **If a part is selected out of the universe, then the selected part or portion is known as a sample. A sample is only a part of the universe.**

*Sample survey:* It is a survey under which only a part taken out of the universe is investigated. It is not essential to investigate every individual item of the universe.

*Census survey or complete enumeration:* Under a census survey, detailed information regarding every individual person or item of a given universe is collected.

__Difference between Census and Sample survey:__ **The following are the differences between the Census and Sample methods of investigation:**

(a) Under the census method, each and every individual item is investigated, whereas under a sample survey only a part of the universe is investigated.

(b) There is no chance of sampling error in a census survey, whereas sampling error cannot be avoided under a sample survey.

(c) A large number of enumerators is required in a census, whereas fewer enumerators are required in a sample survey.

(d) A census survey is more time consuming and costly as compared to a sample survey.

(e) The census survey is an old method and it is less systematic than the sample survey.
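The contrast between the two methods can be sketched with a simulated universe. Everything here is an assumption for illustration: the 10,000 household incomes are randomly generated, the sample size of 400 is arbitrary, and simple random sampling is only one of several possible sampling procedures.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical universe: monthly incomes of 10,000 households (simulated).
universe = [rng.gauss(30000, 8000) for _ in range(10_000)]

# Census: every unit is investigated, so the mean carries no sampling error.
census_mean = statistics.mean(universe)

def sample_survey(universe, n, rng):
    """Investigate only n units chosen by simple random sampling."""
    sample = rng.sample(universe, n)
    return statistics.mean(sample)

# Sample survey: only 400 of the 10,000 units are investigated.
sample_mean = sample_survey(universe, 400, rng)

# Sampling error: the gap between the sample estimate and the census value.
sampling_error = sample_mean - census_mean

print(f"census mean:    {census_mean:.2f}")
print(f"sample mean:    {sample_mean:.2f}")
print(f"sampling error: {sampling_error:.2f}")
```

The sample survey examines 4% of the units, which is where the saving in time, cost and enumerators comes from, at the price of a non-zero sampling error.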

__Merits and Demerits of Census__**:**

__Merits:__

(a) Since all the individuals of the universe are investigated, the highest degree of accuracy is obtained.

(b) Since every unit is investigated and no selection is made, there is no possibility of personal bias in selection, and the method is free from sampling error.

(c) It is more suitable if the field of enquiry is small.

(d) Since all the items of the universe are taken into consideration, all the characteristics of the universe

__Demerits:__

(a) If the field of enquiry is too wide, it is not suitable.

(b) Collection of data from every unit is costly and time consuming.

(c) Personal bias, prejudice and whims may affect the data.

__Merits and Demerits of sample survey__**:**

__Merits:__

(a) Since only a part of the universe is investigated, time and labour are saved.

(b) Fewer enumerators are required, so it is less costly than a census.

(c) Results are much quicker to obtain than under a census.

__Demerits:__

(a) Sampling error cannot be avoided, so the degree of accuracy may not be acceptable.

(b) The sample may or may not represent the universe.

(c) The selection of the sample may be influenced by the personal bias of the investigator.