Statistical data sets are collections of data maintained in an organized form. Any statistical analysis starts with the collection of data, which is then analyzed using statistical tools.
Therefore statistical data sets form the basis from which statistical inferences can be drawn.
Statistical data sets may record as much information as is required by the experiment.
For example, to study the relationship between height and age, only these two parameters might be recorded in the data set.
However, if a more comprehensive study is required, then the experimenter might also want to record the height at birth, weight, nutritional background, family history, etc.
Therefore the researcher needs to determine beforehand which kinds of data must be recorded in the statistical data sets.
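The difference between the minimal and the more comprehensive study can be sketched in Python as two record layouts. This is only an illustration: the class names, field names, units, and every value below are assumptions made up for the example, not measurements from any real study.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MinimalRecord:
    """One subject in a study that records only age and height."""
    age_years: float
    height_cm: float


@dataclass
class ExtendedRecord(MinimalRecord):
    """One subject in a more comprehensive study (hypothetical fields)."""
    birth_height_cm: Optional[float] = None
    weight_kg: Optional[float] = None
    nutrition_notes: str = ""
    family_history: str = ""


# Placeholder values for illustration only, not real measurements.
minimal_data_set = [
    MinimalRecord(age_years=5, height_cm=110.0),
    MinimalRecord(age_years=10, height_cm=138.0),
]
extended_data_set = [
    ExtendedRecord(age_years=5, height_cm=110.0,
                   birth_height_cm=50.0, weight_kg=18.5),
]


def recorded_variables(record):
    """Return the names of the variables a record stores."""
    return [f.name for f in fields(record)]
```

Calling `recorded_variables` on a record from each data set lists what was decided on beforehand: two variables in the minimal design, six in the extended one.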
Certain properties are common to most statistical data sets. For example, the order of the records usually does not matter, which means the arrangement of the subjects within the data set does not affect the analysis. Therefore the researcher has the freedom to organize the subjects under study in whichever order she finds convenient.
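A minimal sketch of this property, assuming a small data set of (age, height) pairs with made-up placeholder values: the rows are shuffled, and the summary statistics come out the same. The function name `summarize` and the choice of statistics are assumptions for illustration. Note that whole records are reordered; the age and height of each subject stay together.

```python
import random
import statistics

# (age in years, height in cm) pairs; placeholder values for illustration only.
data_set = [(5, 110.0), (7, 122.0), (10, 138.0), (12, 149.0), (15, 165.0)]


def summarize(rows):
    """Compute a few summary statistics that do not depend on row order."""
    ages = [age for age, _ in rows]
    heights = [height for _, height in rows]
    return {
        "n": len(rows),
        "mean_height": statistics.mean(heights),
        "median_age": statistics.median(ages),
    }


# Reorder the records with a fixed seed so the example is repeatable.
shuffled = data_set[:]
random.Random(0).shuffle(shuffled)

# The same subjects in a different order give the same summary.
assert summarize(shuffled) == summarize(data_set)
```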
Creating a statistical data set is only the first step in research. The interpretation and validity of the inferences drawn from the data are what matter most. However, this task is not possible without the data sets. Hence these are the starting point for most research in the social sciences, medical sciences and physical sciences.
Huge statistical data sets are already available for many areas.
For example, the International Genealogical Index contains family history records for many people from the past. If a researcher needs to study patterns in such data, she can simply make use of these existing data sets. This makes the job of the researcher much simpler.
A particular statistical data set can be used for a number of research studies. Census data, for example, contains comprehensive information about the demographics of a country, which can then be utilized by a number of social scientists to study family structures, incomes, etc. within the country.
A statistical data set is therefore not an end in itself - it is merely the starting point where all the data is stored. How the data is collected and interpreted depends on the researcher studying the data.