Probabilistic and Statistical Methods of Decision-Making

Statistical Methods

Statistical methods are methods for analyzing statistical data. A distinction is drawn between methods of applied statistics, which can be used in any area of scientific research and any branch of the economy, and statistical methods whose applicability is limited to a particular area. The latter include statistical acceptance control, statistical regulation of technological processes, reliability analysis and testing, and the design of experiments.

Classification of statistical methods

Statistical data analysis methods are used in almost all areas of human activity. They are used whenever it is necessary to obtain and substantiate any judgments about a group (objects or subjects) with some internal heterogeneity.

It is advisable to distinguish three types of scientific and applied activities in the field of statistical methods for data analysis (according to the degree of specificity of methods associated with immersion in specific problems):

a) development and research of general-purpose methods, without taking into account the specifics of the field of application;

b) development and research of statistical models of real phenomena and processes in accordance with the needs of a particular field of activity;

c) the use of statistical methods and models for the statistical analysis of specific data.

Applied statistics

Any statistical study begins with a description of the type of data and the mechanism of their generation. Both deterministic and probabilistic methods are used to describe data. Deterministic methods can analyze only the data that the researcher actually has; for example, such methods produced the tables computed by official state statistical agencies from the statistical reports submitted by enterprises and organizations. Transferring the results obtained to a wider population, or using them for prediction and control, is possible only on the basis of probabilistic-statistical modeling. For this reason, mathematical statistics is often taken to include only methods based on probability theory.

Deterministic and probabilistic-statistical methods should not be opposed to one another; we see them as successive stages of statistical analysis. At the first stage, the available data must be analyzed and presented in an easily understandable form using tables and diagrams. Then it is advisable to analyze the statistical data on the basis of particular probabilistic-statistical models. Note that the possibility of penetrating more deeply into the essence of a real phenomenon or process is provided by the development of an adequate mathematical model.

In the simplest situation, statistical data are the values of some feature characterizing the objects under study. A value can be quantitative, or it can be an indication of the category to which an object belongs; in the latter case one speaks of a qualitative feature.

When an object is measured on several quantitative or qualitative characteristics, the statistical datum describing it is a vector, which can be viewed as a new kind of data; the sample then consists of a set of vectors. If some coordinates are numbers and others are qualitative (categorical) data, one speaks of a vector of mixed-type data.

A single sample element, that is, a single observation, can be an entire function, for example one describing the dynamics of an indicator, i.e., its change over time: a patient's electrocardiogram, the beat amplitude of a motor shaft, or a time series describing the performance of a particular firm. The sample then consists of a set of functions.

Sample elements can also be other mathematical objects, for example binary relations. Thus, when interviewing experts, orderings (rankings) of the objects of expertise are often used: product samples, investment projects, options for management decisions. Depending on the rules of the expert study, the sample elements can be binary relations of various kinds (orderings, partitions, tolerance relations), sets, fuzzy sets, and so on.

So, the mathematical nature of the sample elements can differ widely across problems of applied statistics. Nevertheless, two classes of statistical data can be distinguished: numerical and non-numerical. Accordingly, applied statistics divides into two parts: the statistics of numerical data and the statistics of non-numerical data.

Numerical data are numbers, vectors, and functions. They can be added and multiplied by coefficients, so in the statistics of numerical data various sums play an important role. The mathematical apparatus for analyzing the sums of random sample elements comprises the (classical) laws of large numbers and central limit theorems.
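As a small illustration of why sums of numerical data matter, the following sketch (hypothetical fair-coin data, Python standard library only) shows the law of large numbers at work: the sample mean of coin tosses approaches the expectation 0.5 as the sample grows.

```python
import random

random.seed(42)  # fixed seed for reproducibility of the illustration

def sample_mean(n):
    """Mean of n fair-coin tosses (1 = heads, 0 = tails)."""
    return sum(random.randint(0, 1) for _ in range(n)) / n

# Law of large numbers: as n grows, the sample mean settles near 0.5.
for n in (10, 1000, 100000):
    print(n, sample_mean(n))
```

The spread of the sample mean around 0.5 shrinks like 1/sqrt(n), which is the content of the central limit theorem mentioned above.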

Non-numerical statistical data are categorical data, vectors of mixed-type features, binary relations, sets, fuzzy sets, and so on. They cannot be added or multiplied by coefficients, so it makes no sense to speak of sums of non-numerical data. Such data are elements of non-numerical mathematical spaces (sets). The mathematical apparatus for analyzing non-numerical statistical data is based on the use of distances between elements of such spaces (as well as measures of proximity and indicators of difference). With the help of distances, empirical and theoretical means are defined, laws of large numbers are proved, nonparametric estimates of probability density are constructed, problems of diagnostics and cluster analysis are solved, and so on.
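A minimal sketch of the distance-based approach, under the assumption that the sample elements are expert rankings: the Kendall distance counts pairs of objects ordered differently by two rankings, and the empirical mean ranking is the one minimising the total distance to the data. The expert rankings below are invented for illustration.

```python
from itertools import permutations

def kendall_distance(a, b):
    """Number of object pairs that rankings a and b order differently."""
    n = len(a)
    pos_b = {x: i for i, x in enumerate(b)}
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if pos_b[a[i]] > pos_b[a[j]]
    )

# Hypothetical rankings of three products by three experts.
experts = [(1, 2, 3), (1, 3, 2), (2, 1, 3)]

# The "empirical mean" ranking minimises total distance to the data.
mean_ranking = min(
    permutations((1, 2, 3)),
    key=lambda r: sum(kendall_distance(r, e) for e in experts),
)
print(mean_ranking)  # (1, 2, 3)
```

The same minimisation idea carries over to partitions, sets, and fuzzy sets once a distance on the corresponding space is chosen.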

Various types of statistical data are used in applied research, owing in particular to the ways in which they are obtained. For example, if tests of some technical devices continue only until a certain moment in time, we obtain so-called censored data, consisting of the operating times to failure of the devices that failed, together with the information that the remaining devices were still working when the test ended. Censored data are often used in assessing and monitoring the reliability of technical devices.
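A sketch of how censored lifetimes can be processed. It uses the Kaplan-Meier (product-limit) estimator of the survival function, a standard technique the text does not name explicitly, on invented test data: each observation is a pair (time, event), where event=True means a failure was observed and event=False means the device was still working when the test ended.

```python
from collections import defaultdict

# Hypothetical test results: three observed failures, two censored devices.
data = [(50, True), (70, True), (70, False), (90, True), (120, False)]

def kaplan_meier(observations):
    """Product-limit estimate of the survival function S(t)."""
    deaths, censored = defaultdict(int), defaultdict(int)
    for t, event in observations:
        (deaths if event else censored)[t] += 1
    n = len(observations)          # devices still at risk
    s, curve = 1.0, []
    for t in sorted(set(deaths) | set(censored)):
        d = deaths[t]
        if d:
            s *= (n - d) / n       # survival drops at each observed failure
            curve.append((t, s))
        n -= d + censored[t]       # failed and censored devices leave the risk set
    return curve

print(kaplan_meier(data))  # survival estimates at each failure time
```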

Usually, statistical methods for analyzing data of the first three types are considered separately. This separation is caused by the circumstance noted above: the mathematical apparatus for analyzing non-numerical data differs significantly from that used for data in the form of numbers, vectors, and functions.

Probabilistic-statistical modeling

When applying statistical methods in specific areas of knowledge and sectors of the national economy, we obtain scientific and practical disciplines such as “statistical methods in industry”, “statistical methods in medicine”, etc. From this point of view, econometrics is “statistical methods in economics”. These disciplines of group b) are usually based on probabilistic-statistical models, built in accordance with the characteristics of the field of application. It is very instructive to compare the probabilistic-statistical models used in various fields, to discover their closeness and, at the same time, to state some differences. Thus, one can see the proximity of the setting of tasks and the statistical methods used to solve them in such areas as scientific medical research, specific sociological research and marketing research, or, in short, in medicine, sociology and marketing. These are often grouped together under the name “sample surveys”.

The difference between sample studies and expert studies is manifested, first of all, in the number of surveyed objects or subjects - in sample studies we usually talk about hundreds, and in expert studies - about dozens. But the technologies of expert research are much more sophisticated. The specificity is even more pronounced in demographic or logistic models, in the processing of narrative (textual, chronicle) information or in the study of the mutual influence of factors.

The reliability and safety of technical devices and technologies, as well as queuing theory, are treated in detail in a large number of scientific works.

Statistical analysis of specific data

The application of statistical methods and models for the statistical analysis of specific data is closely tied to the problems of the relevant field. The results of the third of the selected types of scientific and applied activities are at the intersection of disciplines. They can be viewed as examples of the practical application of statistical methods. But there is no less reason to attribute them to the corresponding field of human activity.

For example, the results of a survey of instant coffee consumers can naturally be attributed to marketing (which is what they do when they give lectures on marketing research). The study of the dynamics of price growth using inflation indices calculated on the basis of independently collected information is of interest primarily from the point of view of economics and management of the national economy (both at the macrolevel and at the level of individual organizations).

Development prospects

The theory of statistical methods is aimed at solving real-life problems. Therefore, new formulations of mathematical problems for the analysis of statistical data constantly arise in it, new methods are developed and substantiated. Justification is often carried out mathematically, that is, by proving theorems. The methodological component plays an important role - how exactly to set tasks, what assumptions to make for the purpose of further mathematical study. The role of modern information technologies is great, in particular, a computer experiment.

An urgent task is to analyze the history of statistical methods in order to identify development trends and apply them for forecasting.



How are probability theory and mathematical statistics used? These disciplines are the basis of probabilistic and statistical decision-making methods. To use their mathematical apparatus, it is necessary to express decision-making problems in terms of probabilistic-statistical models. The application of a specific probabilistic-statistical decision-making method consists of three stages:

The transition from economic, managerial, technological reality to an abstract mathematical and statistical scheme, i.e. construction of a probabilistic model of a control system, technological process, decision-making procedure, in particular, based on the results of statistical control, etc.

Carrying out calculations and obtaining conclusions by purely mathematical means within the framework of a probabilistic model;

Interpretation of mathematical and statistical conclusions in relation to a real situation and making an appropriate decision (for example, on the conformity or non-conformity of product quality with established requirements, the need to adjust the technological process, etc.), in particular, conclusions (on the proportion of defective product units in a batch, on the specific form of the distribution laws of the controlled parameters of the technological process, etc.).

Mathematical statistics uses the concepts, methods and results of the theory of probability. Let's consider the main issues of constructing probabilistic decision-making models in economic, managerial, technological and other situations. For the active and correct use of normative-technical and instructive-methodological documents on probabilistic-statistical methods of decision-making, preliminary knowledge is required. So, you need to know under what conditions a particular document should be applied, what initial information is necessary to have for its selection and application, what decisions should be made based on the results of data processing, etc.

Examples of applying probability theory and mathematical statistics. Let us consider a few examples in which probabilistic-statistical models are a good tool for solving managerial, production, economic, and national-economic problems. Thus, in A.N. Tolstoy's novel The Road to Calvary (vol. 1) we read: "the shop turns out twenty-three percent rejects, and you stick to that figure," Strukov said to Ivan Ilyich.

The question arises how to understand these words in a conversation between factory managers, since a single unit of production cannot be 23% defective: it is either good or defective. Probably Strukov meant that a large batch contains about 23% defective items. Then the question arises, what does "about" mean? Suppose 30 out of 100 tested units turn out to be defective, or 300 out of 1,000, or 30,000 out of 100,000; should Strukov then be accused of lying?

Or another example. A coin used for drawing lots must be "symmetric": when it is tossed, heads should come up on average in half the cases and tails in the other half. But what does "on average" mean? If you carry out many series of 10 tosses each, you will often encounter series in which the coin comes up heads exactly 4 times. For a symmetric coin this happens in 20.5% of the series. And if there are 40,000 heads in 100,000 tosses, can the coin be considered symmetric? The decision procedure is built on probability theory and mathematical statistics.
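The figures in this example are easy to check with a short computation (Python standard library only): the binomial formula gives the 20.5% figure, and a normal-approximation z-score shows why 40,000 heads in 100,000 tosses rules out a symmetric coin.

```python
from math import comb, sqrt

def binom_pmf(n, k, p=0.5):
    """Probability of exactly k heads in n tosses of a coin with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 4 heads in a series of 10 tosses of a symmetric coin:
print(round(binom_pmf(10, 4), 3))  # 0.205, i.e. about 20.5% of series

# 40,000 heads in 100,000 tosses: deviation from the expected 50,000
# measured in standard deviations of the binomial distribution.
n, k, p = 100_000, 40_000, 0.5
z = (k - n * p) / sqrt(n * p * (1 - p))
print(z)  # about -63 standard deviations: the coin is clearly not symmetric
```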

The example in question may not seem serious enough. In fact, it is. Drawing lots is widely used in the organization of industrial technical and economic experiments, for example when processing the results of measuring a quality indicator (the friction moment) of bearings as a function of various technological factors (the conservation environment, the methods of preparing bearings before measurement, the effect of the bearing load during measurement, and so on). Suppose it is necessary to compare the quality of bearings depending on the results of storing them in different conservation oils, i.e., in oils of composition A and composition B. When planning such an experiment, the question arises which bearings should be placed in the oil of composition A and which in the oil of composition B, in a way that avoids subjectivity and ensures the objectivity of the decision.

The answer to this question can be obtained by drawing lots. A similar example arises in quality control of any product. To decide whether a controlled batch of products meets the established requirements, a sample is taken from it, and a conclusion about the whole batch is drawn from the sampling results. Here it is very important to avoid subjectivity in the selection of the sample: every unit of production in the controlled lot must have the same probability of being selected. Under production conditions, units are usually selected for the sample not by lot but by special tables of random numbers or with the help of computer random number generators.
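A minimal sketch of such objective selection with a computer random number generator; the batch and sample sizes here are invented. Every unit has the same probability of entering the sample, which is exactly what `random.sample` guarantees.

```python
import random

random.seed(0)  # fixed seed for a reproducible illustration

# A hypothetical controlled lot of 500 production units.
batch = [f"unit-{i:03d}" for i in range(1, 501)]

# Draw a simple random sample of 20 units without replacement:
# each unit has the same chance of being selected.
sample = random.sample(batch, k=20)
print(sample[:5])
```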

Similar problems of ensuring objectivity of comparison arise when comparing different schemes of production organization or remuneration, when holding tenders and competitions, when selecting candidates for vacant positions, and so on. Everywhere a draw or a similar procedure is needed. Let us explain with the example of identifying the strongest and the second-strongest team in a tournament organized on the Olympic system (the loser is eliminated). Suppose the stronger team always beats the weaker one. Clearly the strongest team will certainly become champion. The second-strongest team will reach the final if and only if it plays no games against the future champion before the final; if such a game is scheduled, the second-strongest team will not reach the final. Whoever plans the tournament can either "knock out" the second-strongest team early, pairing it with the leader in the first round, or secure it second place by arranging meetings with weaker teams up to the final. To avoid such subjectivity, a draw is held. For an 8-team tournament, the probability that the two strongest teams meet in the final is 4/7; accordingly, with probability 3/7 the second-strongest team leaves the tournament early.
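The 4/7 figure can be checked by simulating the random draw. In this sketch the bracket positions 0-3 and 4-7 form the two halves; the two strongest teams meet in the final exactly when they land in opposite halves.

```python
import random

random.seed(1)  # reproducible illustration

def strongest_two_meet_in_final():
    """Random draw of 8 teams (1 = strongest, 2 = second strongest).
    They meet in the final iff they fall into opposite bracket halves."""
    bracket = list(range(1, 9))
    random.shuffle(bracket)
    half_of = {team: pos < 4 for pos, team in enumerate(bracket)}
    return half_of[1] != half_of[2]

trials = 100_000
freq = sum(strongest_two_meet_in_final() for _ in range(trials)) / trials
print(freq)  # close to 4/7, i.e. about 0.571
```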

Any measurement of product units (using a caliper, micrometer, ammeter, etc.) has errors. To find out whether there are systematic errors, it is necessary to make multiple measurements of a unit of production, the characteristics of which are known (for example, a standard sample). It should be remembered that in addition to the systematic error, there is also a random error.

Therefore, the question arises of how to find out from the measurement results whether there is a systematic error. If we note only whether the error obtained in the next measurement is positive or negative, the problem reduces to the previous one. Indeed, let us liken a measurement to a coin toss: a positive error corresponds to heads and a negative one to tails (with a sufficient number of scale divisions, a zero error practically never occurs). Then checking for the absence of a systematic error is equivalent to checking the symmetry of the coin.

The purpose of this reasoning is to reduce the problem of checking for the absence of a systematic error to the problem of checking the symmetry of a coin. It leads to the so-called sign test in mathematical statistics.
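A sketch of the sign test this reasoning leads to, on an invented series of measurements: under the null hypothesis of no systematic error, the number of positive error signs follows a symmetric binomial distribution, so a very lopsided count yields a small p-value.

```python
from math import comb

def sign_test_p_value(n_pos, n_neg):
    """Two-sided sign test: under H0 (no systematic error) the signs of
    the measurement errors behave like tosses of a symmetric coin."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # Probability of a count at least as extreme as observed, doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Hypothetical data: 14 of 16 repeated measurements gave a positive error.
p = sign_test_p_value(14, 2)
print(p)  # a small p-value, evidence of a systematic error
```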

In the statistical regulation of technological processes, the methods of mathematical statistics are used to develop rules and plans for statistical process control, aimed at the timely detection of disruptions in technological processes, taking measures to adjust them, and preventing the release of products that do not meet the established requirements. These measures reduce production costs and losses from the supply of substandard products. In statistical acceptance control, the methods of mathematical statistics are used to develop quality control plans based on the analysis of samples from batches of products. The difficulty lies in correctly building the probabilistic-statistical decision-making models on whose basis the questions posed above can be answered. For this purpose, mathematical statistics has developed probabilistic models and methods for testing hypotheses, in particular the hypothesis that the proportion of defective units of production equals a certain number p0, for example p0 = 0.23 (recall Strukov's words in A.N. Tolstoy's novel).
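As one illustration of such a hypothesis test (a sketch, not the only possible formulation), the following code computes the exact binomial upper-tail probability of observing at least 30 defective items in 100 when the true proportion is p0 = 0.23, the figures from the earlier discussion of Strukov's claim.

```python
from math import comb

def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# H0: the defect proportion is p0 = 0.23 (Strukov's figure).
# Observed: 30 defective items out of 100 inspected.
p_small = binom_upper_tail(100, 30, 0.23)
print(p_small)
# Not negligibly small: 30/100 could plausibly occur by chance under H0,
# so Strukov should not be accused of lying on this evidence. By contrast,
# 30,000 out of 100,000 gives the same fraction but an astronomically
# small tail probability, and would clearly contradict the 23% claim.
```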

Estimation problems. In a number of managerial, industrial, and economic situations, problems of another type arise: problems of estimating the characteristics and parameters of probability distributions.

Let us look at an example. Suppose a batch consists of N light bulbs, from which a sample of n bulbs is randomly selected. A number of natural questions arise. How can the average service life of the bulbs be determined from the test results for the sample, and with what accuracy can this characteristic be estimated? How does the accuracy change if a larger sample is taken? For what number of hours T can it be guaranteed that at least 90% of the bulbs will last T hours or more?
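A sketch of how these questions might be approached numerically; the exponential lifetimes, their mean, and the sample size below are invented assumptions, not figures from the source.

```python
import random
from math import sqrt

random.seed(7)  # reproducible illustration

# Hypothetical lifetimes (hours) of a sample of n = 50 tested bulbs.
lifetimes = sorted(random.expovariate(1 / 1000) for _ in range(50))

n = len(lifetimes)
mean = sum(lifetimes) / n
std = sqrt(sum((x - mean) ** 2 for x in lifetimes) / (n - 1))

# Accuracy of the mean: its standard error shrinks like 1/sqrt(n),
# which is how a larger sample improves the estimate.
std_error = std / sqrt(n)
print(mean, std_error)

# T such that an estimated 90% of bulbs last at least T hours:
# the empirical 10th percentile of the sample.
T = lifetimes[int(0.1 * n)]
print(T)
```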

Suppose that when a sample of n bulbs is tested, X of them turn out to be defective. What bounds can then be given for the number D of defective bulbs in the whole batch, for the defect rate D/N, and so on?
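One common way to bound D/N (an assumption here, not a method prescribed by the source) is a normal-approximation confidence interval for the sample proportion; the inspection figures below are invented.

```python
from math import sqrt

# Hypothetical inspection: X = 18 defective bulbs in a sample of n = 200
# drawn from a batch of N = 10,000 bulbs.
n, X, N = 200, 18, 10_000

p_hat = X / n
# Approximate 95% confidence interval for the defect rate D/N
# (normal approximation; adequate when n*p_hat and n*(1-p_hat) are large).
half_width = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
low, high = max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)
print(low, high)

# Corresponding approximate bounds on the number D of defective bulbs.
print(int(low * N), int(high * N))
```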

Or, in the statistical analysis of the accuracy and stability of technological processes, it is necessary to evaluate such quality indicators as the average value of the controlled parameter and the degree of its dispersion in the process under consideration. According to the theory of probability, it is advisable to use its mathematical expectation as the mean value of a random variable, and variance, standard deviation, or coefficient of variation as a statistical characteristic of the spread. This raises the question: how to evaluate these statistical characteristics from sample data and with what accuracy can this be done? There are many similar examples. Here it was important to show how the theory of probability and mathematical statistics can be used in production management when making decisions in the field of statistical management of product quality.

What is "mathematical statistics"? Mathematical statistics is understood as "a branch of mathematics devoted to mathematical methods for collecting, organizing, processing, and interpreting statistical data, as well as to their use for scientific or practical conclusions. The rules and procedures of mathematical statistics are based on probability theory, which makes it possible to assess the accuracy and reliability of the conclusions obtained in each problem from the available statistical material." Here statistical data means information about the number of objects in some more or less extensive collection that have certain characteristics.

According to the type of problems being solved, mathematical statistics is usually divided into three sections: data description, estimation and hypothesis testing.

By the type of processed statistical data, mathematical statistics is divided into four areas:

One-dimensional statistics (statistics of random variables), in which the observation result is described by a real number;

Multivariate statistical analysis, where the result of observation of an object is described by several numbers (vector);

Statistics of random processes and time series, where the observation result is a function;

Statistics of objects of a non-numerical nature, in which the observation result is of a non-numerical nature, for example, it is a set (geometric figure), an ordering, or is obtained as a result of measurement by a qualitative attribute.

Historically, the first to appear were certain areas of the statistics of objects of non-numerical nature (in particular, the problem of estimating the proportion of defective items and testing hypotheses about it) and one-dimensional statistics. Their mathematical apparatus is simpler, so they are usually used to demonstrate the basic ideas of mathematical statistics.

Only those data-processing methods that rely on probabilistic models of the relevant real phenomena and processes are evidence-based, i.e., belong to mathematical statistics. We are talking about models of consumer behavior, the occurrence of risks, the functioning of technological equipment, the obtaining of experimental results, the course of a disease, and so on. A probabilistic model of a real phenomenon is considered constructed when the quantities under consideration and the relations between them are expressed in terms of probability theory. The adequacy of the probabilistic model to reality is substantiated, in particular, with the help of statistical methods for testing hypotheses.

Non-probabilistic data-processing methods are exploratory: they can be used only for preliminary data analysis, since they do not make it possible to assess the accuracy and reliability of conclusions drawn from limited statistical material.

Probabilistic and statistical methods are applicable wherever it is possible to build and substantiate a probabilistic model of a phenomenon or process. Their use is mandatory when conclusions drawn from a sample of data are transferred to the entire population (for example, from a sample to an entire batch of products).

In specific areas of application, both probabilistic-statistical methods of widespread use and specific ones are used. For example, in the section of production management devoted to statistical methods of product quality management, applied mathematical statistics (including planning of experiments) are used. Using its methods, a statistical analysis of the accuracy and stability of technological processes and a statistical assessment of quality are carried out. Specific methods include methods of statistical acceptance control of product quality, statistical regulation of technological processes, assessment and control of reliability, etc.

Applied probabilistic-statistical disciplines such as reliability theory and queuing theory are widely used. The content of the first is clear from its name; the second studies systems such as a telephone exchange, which receives calls at random moments: the requests of subscribers dialing numbers on their telephones. The durations of servicing these requests, i.e., the durations of the conversations, are also modeled as random variables. A great contribution to the development of these disciplines was made by Corresponding Member of the USSR Academy of Sciences A.Ya. Khinchin (1894-1959), Academician of the Academy of Sciences of the Ukrainian SSR B.V. Gnedenko (1912-1995), and other domestic scientists.

Briefly about the history of mathematical statistics. Mathematical statistics as a science begins with the works of the famous German mathematician Carl Friedrich Gauss (1777-1855), who, on the basis of probability theory, investigated and substantiated the method of least squares, which he created in 1795 and applied to the processing of astronomical data (to refine the orbit of the minor planet Ceres). One of the most popular probability distributions, the normal distribution, is often named after him, and in the theory of random processes the main object of study is Gaussian processes.

At the end of the 19th and the beginning of the 20th century, a major contribution to mathematical statistics was made by English researchers, above all K. Pearson (1857-1936) and R.A. Fisher (1890-1962). In particular, Pearson developed the chi-square test for testing statistical hypotheses, and Fisher developed analysis of variance, the theory of experimental design, and the maximum likelihood method of parameter estimation.

In the 1930s, the Pole Jerzy Neyman (1894-1981) and the Englishman E. Pearson developed the general theory of testing statistical hypotheses, and the Soviet mathematicians Academician A.N. Kolmogorov (1903-1987) and Corresponding Member of the USSR Academy of Sciences N.V. Smirnov (1900-1966) laid the foundations of nonparametric statistics. In the 1940s, the Romanian-born A. Wald (1902-1950) built the theory of sequential statistical analysis.

Mathematical statistics is developing rapidly at the present time. So, over the past 40 years, four fundamentally new areas of research can be distinguished:

Development and implementation of mathematical methods for planning experiments;

Development of statistics of objects of non-numerical nature as an independent direction in applied mathematical statistics;

Development of statistical methods that are stable in relation to small deviations from the used probabilistic model;

Widespread development of work on the creation of computer software packages designed for statistical analysis of data.

Probabilistic-statistical methods and optimization. The idea of optimization permeates modern applied mathematical statistics and other statistical methods: the methods of experimental design, statistical acceptance control, statistical regulation of technological processes, and so on.

In production management, in particular when optimizing product quality and the requirements of standards, it is especially important to apply statistical methods at the initial stage of the product life cycle, i.e., at the stage of research preceding experimental design development (formulating prospective requirements for products, preliminary design, preparing technical specifications for development). This is because of the limited information available at the initial stage of the life cycle and the need to forecast the technical capabilities and economic situation for the future. Statistical methods should be applied at all stages of solving an optimization problem: when scaling variables, developing mathematical models of the functioning of products and systems, conducting technical and economic experiments, and so on.

All areas of statistics are used in optimization problems, including the optimization of product quality and of the requirements of standards: the statistics of random variables, multivariate statistical analysis, the statistics of random processes and time series, and the statistics of objects of non-numerical nature. The choice of a statistical method for analyzing specific data should be made in accordance with published recommendations.

The phenomena of life, like all phenomena of the material world in general, have two inextricably linked sides: qualitative, perceived directly by the senses, and quantitative, expressed in numbers with the help of counting and measure.

In the study of various natural phenomena, both qualitative and quantitative indicators are used simultaneously. There is no doubt that the essence of the phenomena under study is revealed most fully only in the unity of their qualitative and quantitative aspects. In practice, however, one has to use either one or the other kind of indicator.

There is no doubt that quantitative methods, as more objective and accurate, have an advantage over the qualitative characteristics of objects.

The measurement results themselves, although of definite value, are not sufficient for drawing the necessary conclusions. The digital data collected in mass testing are merely raw factual material that needs appropriate mathematical processing. Without processing, i.e. ordering and systematizing the data, it is impossible to extract the information they contain, to assess the reliability of individual summary indicators, or to make sure that the differences observed between them are significant. This work requires certain knowledge from specialists: the ability to correctly generalize and analyze the data collected in the experiment. The system of this knowledge constitutes the content of statistics, a science concerned mainly with the analysis of research results in the theoretical and applied fields of science.

It should be borne in mind that mathematical statistics and probability theory are purely theoretical, abstract sciences; they study statistical aggregates without regard to the specifics of their constituent elements. The methods of mathematical statistics and the theory of probability underlying it are applicable to a wide variety of fields of knowledge, including the humanities.

The study of phenomena is carried out not on individual observations, which may turn out to be random, atypical, or incompletely expressing the essence of a given phenomenon, but on a set of homogeneous observations, which gives more complete information about the object under study. A certain set of relatively homogeneous objects, united according to one or another criterion for joint study, is called a statistical aggregate (statistical population). A population combines a number of homogeneous observations or registrations.

The elements that make up a population are called its members, or variants. Variants are individual observations or numeric values of a characteristic. Thus, if we denote a feature by X (capital), then its values, or variants, are denoted by x (lowercase): x1, x2, etc.

The total number of variants that make up a given population is called its volume and is denoted by the lowercase letter n.

When the entire set of homogeneous objects is examined as a whole, it is called the general population. Examples of such a complete survey of a population are national censuses and a nationwide statistical registration of animals. A complete survey of the general population naturally provides the fullest information about its condition and properties, so researchers strive to bring together as many observations as possible.

In reality, however, one rarely has to resort to surveying all members of the general population. First, this work requires a great deal of time and labor; second, it is not always feasible for a variety of reasons. So, instead of a complete survey of the general population, some part of it, called the sample population, or sample, is usually studied. It serves as the model by which the entire general population is judged. For example, to find out the average height of the conscript population of a certain region or district, it is not at all necessary to measure all the conscripts living there; it is enough to measure some part of them. For the sample to serve this purpose, it must satisfy several requirements.

1. The sample should be representative, or typical, i.e. it should consist chiefly of those variants that most fully reflect the general population. Therefore, before processing sample data, they are carefully reviewed and clearly atypical variants are removed. For example, when analyzing the cost of products manufactured by an enterprise, the costs in those periods when the enterprise was not fully supplied with components or raw materials should be excluded.

2. The sample must be objective. When forming a sample, one should not act selectively, including only those variants that seem typical and rejecting all the rest. A good-quality sample is drawn without preconceived opinions, by lot or by lottery, so that no variant of the general population has any advantage over the others in being included in or excluded from the sample. In other words, the sample should be drawn at random, without any influence on its composition.

3. The sample should be qualitatively uniform. It is impossible to include in the same sample data obtained under different conditions, for example, the cost of products obtained with a different number of employees.

6.2. Grouping observation results

Usually, the results of experiments and observations are entered as numbers in registration cards or a journal, and sometimes simply on sheets of paper, yielding a statement or register. Such initial documents, as a rule, contain information not about one but about several attributes on which the observations were made. These documents serve as the main source for forming the sample. This is usually done as follows: the numerical values of the attribute by which the population is formed are written out from the primary document, i.e. the card index, journal, or statement, onto a separate sheet of paper. The variants in such a compilation usually appear as a disorderly mass of numbers. Therefore, the first step towards processing such material is ordering and systematizing it: grouping the variants into statistical tables or series.

Statistical tables are one of the most common forms of grouping sample data. They are illustrative, showing some general results, the position of individual elements in the general series of observations.

Another form of primary grouping of sample data is the ranking method, i.e. arranging the variants in a certain order, by increasing or decreasing values of the attribute. The result is a so-called ranked series, which shows within what limits and how the given attribute varies. For example, suppose there is a sample of the following composition:

5,2,1,5,7,9,3,5,4,10,4,5,7,3,5, 9,4,12,7,7

It can be seen that the attribute varies from 1 to 12 units. We arrange the variants in ascending order:

1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 7, 7, 7, 7, 9, 9, 10, 12

As a result, a ranked series of values ​​of the varying attribute was obtained.

It is clear that the ranking method as shown here is applicable only to small samples. With a large number of observations, the ranking becomes difficult, because the row is so long that it loses its meaning.

With a large number of observations, it is customary to rank the sample in the form of a double series, i.e. indicating the frequencies or relative frequencies of individual variants of the ranked series. Such a double series of ranked values of an attribute is called a variation series, or distribution series. The simplest example of a variation series is the data ranked above, arranged as follows:

Characteristic values (variants):  1   2   3   4   5   7   9   10   12
Frequencies:                       1   1   2   3   5   4   2    1    1

The variation series shows with what frequency individual variants occur in a given population and how they are distributed, which is of great importance, allowing one to judge the patterns and the range of variation of quantitative attributes. Constructing a variation series also facilitates the calculation of summary indicators: the arithmetic mean and the variance (dispersion) of the variants about their mean, indicators that characterize any statistical population.
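The ranking and tallying just described are easy to reproduce programmatically; a minimal sketch in Python, using the sample given in the text:

```python
from collections import Counter

# The sample given in the text.
sample = [5, 2, 1, 5, 7, 9, 3, 5, 4, 10, 4, 5, 7, 3, 5, 9, 4, 12, 7, 7]

# Ranked series: the variants arranged in ascending order.
ranked = sorted(sample)

# Variation series: each distinct variant paired with its frequency.
variation_series = sorted(Counter(sample).items())
# -> [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (7, 4), (9, 2), (10, 1), (12, 1)]
```

The frequencies sum to the sample volume n = 20, matching the double series shown above.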

Variational series are of two types: discontinuous and continuous. A discontinuous variation series is obtained from the distribution of discrete quantities, which include counting features. If the feature varies continuously, i.e. can take any values ​​in the range from the minimum to the maximum variant of the population, then the latter is distributed in a continuous variation series.

To construct a variation series of a discretely varying attribute, it is sufficient to arrange the entire set of observations in the form of a ranked series, indicating the frequencies of individual variants. As an example, we give data showing the size distribution of 267 parts (Table 6.1).

Table 6.1. Distribution of parts by size.

To build a variation series of a continuously varying attribute, one needs to divide the entire range of variation from the minimum to the maximum variant into separate groups, or intervals (from-to), called classes, and then distribute all the variants of the population among these classes. The result is a double variation series in which the frequencies refer not to individual variants but to entire intervals, i.e. they are frequencies of classes.

The division of the total range of variation into classes is carried out using the class interval, which should be the same for all classes of the variation series. The size of the class interval is denoted by i (from the word intervalum - interval, distance); it is determined by the following formula:

i = (x_max - x_min) / (1 + 3.32 lg n),   (6.1)

where: i is the class interval, taken as a convenient whole number; x_max and x_min are the maximum and minimum variants of the sample; and 1 + 3.32 lg n (Sturges' rule) approximates the number of classes into which the sample is divided, n being the sample size.

The number of classes is set somewhat arbitrarily, taking into account that it depends on the sample size: the larger the sample, the more classes there should be, and vice versa, with smaller samples a smaller number of classes should be taken. Experience has shown that even on small samples, when variants must be grouped into a variation series, one should not set fewer than 5-6 classes. With 100-150 variants, the number of classes can be increased to 12-15. If the population consists of 200-300 variants, it is divided into 15-18 classes, etc. Of course, these recommendations are very conditional and cannot be taken as an established rule.
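The class-interval calculation can be sketched as follows, assuming the common Sturges-type rule i = (x_max - x_min) / (1 + 3.32 lg n); rounding the result to a convenient whole interval is left to the researcher:

```python
import math

def class_interval(sample):
    """Class interval i = (x_max - x_min) / (1 + 3.32 * lg n), where the
    denominator (Sturges' rule) approximates the number of classes."""
    n = len(sample)
    num_classes = 1 + 3.32 * math.log10(n)
    return (max(sample) - min(sample)) / num_classes

# For 100 observations spanning 0..99, lg 100 = 2, so the rule gives
# 1 + 6.64 = 7.64 classes and an interval of about 99 / 7.64, i.e. about 13.
i = class_interval(list(range(100)))
```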

When breaking down into classes, in each specific case, you have to reckon with a number of different circumstances, ensuring that the processing of statistical material gives the most accurate results.

After the class interval has been established and the sample divided into classes, the variants are tallied by class and the number of variants (frequencies) in each class is determined. The result is a variation series in which the frequencies belong not to individual variants but to particular classes. The sum of all frequencies of the variation series must equal the sample size, that is

Σp = n,   (6.2)

where Σ is the summation sign, p is the class frequency, and n is the sample size.

If this equality does not hold, an error was made in tallying the variants by class, and it must be found and eliminated.

Usually, an auxiliary table with four columns is drawn up for tallying the variants by class: 1) the classes for the given attribute (from - to); 2) the class midpoints; 3) the tally of variants by class; 4) the class frequencies (see Table 6.2).

Tallying the variants by class requires close attention. The same variant must not be marked twice, nor must identical variants fall into different classes. To avoid mistakes in distributing the variants among classes, it is recommended not to search the population for identical variants, but to assign the variants to classes one by one, which is not the same thing. Ignoring this rule, as happens in the work of inexperienced researchers, takes much time when tallying and, most importantly, leads to errors.

Table 6.2. Tallying the variants by class
(columns: class boundaries; class midpoints (x); tally of variants; class frequencies (p), absolute and relative, %)

Having finished tallying the variants and counting their number for each class, we obtain a continuous (interval) variation series. It must be turned into a discontinuous variation series. For this, as already noted, we take the half-sums of the extreme values of the classes. For example, the midpoint value of the first class, equal to 8.8, is obtained as follows:

(8.6 + 9.0) : 2 = 8.8.

The second value (9.3) of this column is calculated in a similar way:

(9.1 + 9.5) : 2 = 9.3, etc.

As a result, a discontinuous variation series is obtained, showing the distribution according to the studied attribute (Table 6.3).
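Tallying the variants into classes and replacing each class by the half-sum of its boundaries can be sketched as follows (the class boundaries and the small sample below are hypothetical illustrations):

```python
def tally_by_class(sample, classes):
    """classes: list of (low, high) inclusive class boundaries.
    Returns (midpoint, frequency) pairs, where midpoint = (low + high) / 2."""
    series = []
    for low, high in classes:
        freq = sum(1 for x in sample if low <= x <= high)
        series.append((round((low + high) / 2, 2), freq))
    return series

# Hypothetical classes and sample:
classes = [(8.6, 9.0), (9.1, 9.5)]
sample = [8.7, 8.9, 9.2, 9.4, 9.5]
print(tally_by_class(sample, classes))  # [(8.8, 2), (9.3, 3)]
```

The sum of the class frequencies again equals the sample size, as required by equality (6.2).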

Table 6.3. Variational series

The grouping of sample data in the form of a variation series has a twofold purpose: firstly, as an auxiliary operation, it is necessary when calculating total indicators, and secondly, the distribution series show the regularity of the variation of features, which is very important. To express this pattern more clearly, it is customary to depict the variation series graphically in the form of a histogram (Figure 6.1.)


Figure 6.1 Distribution of enterprises by number of employees

The bar graph (histogram) depicts the distribution of the variants for a continuously varying attribute. The rectangles correspond to the classes, and their heights to the number of variants in each class. If we drop perpendiculars from the midpoints of the tops of the histogram's rectangles to the abscissa axis and then connect these points, we obtain a graph of continuous variation called a distribution polygon (an empirical analogue of the distribution density).
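The histogram idea can be sketched even in plain text, with one bar per class whose length equals the class frequency (the midpoints and frequencies below are hypothetical):

```python
def text_histogram(series):
    """Render a variation series as text: one line per class, showing the
    class midpoint and a bar with one '#' per variant in the class."""
    return [f"{mid:>6} | " + "#" * freq for mid, freq in series]

# Hypothetical class midpoints and frequencies:
for line in text_histogram([(8.8, 2), (9.3, 5), (9.8, 8), (10.3, 4), (10.8, 1)]):
    print(line)
```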

How are probability theory and mathematical statistics used? These disciplines are the basis of probabilistic-statistical decision-making methods. To use their mathematical apparatus, decision-making problems must be expressed in terms of probabilistic-statistical models. The application of a specific probabilistic-statistical decision-making method consists of three stages:

  • transition from economic, managerial, or technological reality to an abstract mathematical-statistical scheme, i.e. building a probabilistic model of the control system, technological process, or decision-making procedure (in particular based on the results of statistical control), etc.;
  • making calculations and obtaining conclusions by purely mathematical means within the framework of the probabilistic model;
  • interpretation of the mathematical-statistical conclusions in relation to the real situation and making an appropriate decision (for example, on the conformity or non-conformity of product quality with established requirements, or on the need to adjust the technological process), in particular conclusions on the proportion of defective units in a batch or on the specific form of the distribution laws of the monitored parameters of the technological process.

Mathematical statistics uses the concepts, methods and results of probability theory. Let us consider the main issues of building probabilistic models of decision-making in economic, managerial, technological and other situations. Active and correct use of normative-technical and instructional-methodological documents on probabilistic-statistical decision-making methods requires prior knowledge. Thus, one needs to know under what conditions a particular document should be applied, what initial information is necessary for its selection and application, what decisions should be made based on the results of data processing, and so on.

Examples of the application of probability theory and mathematical statistics. Let us consider a few examples in which probabilistic-statistical models are a good tool for solving managerial, production, economic and national-economic problems. For instance, in A.N. Tolstoy's novel "The Road to Calvary" (vol. 1) we read: "The workshop gives twenty-three percent of rejects, and you stick to this figure," Strukov said to Ivan Ilyich.

The question arises how to understand these words in a conversation between factory managers, since a single unit of production cannot be 23% defective: it is either good or defective. Probably, Strukov meant that a large batch contains approximately 23% defective items. Then the question arises: what does "approximately" mean? If 30 out of 100 tested units of production turn out to be defective, or 300 out of 1,000, or 30,000 out of 100,000, etc., should Strukov be accused of lying?

Or another example. A coin to be used for drawing lots must be "symmetric": when it is tossed, heads should come up on average in half of the cases and tails in the other half. But what does "on average" mean? If you carry out many series of 10 tosses each, you will often encounter series in which the coin comes up heads exactly 4 times; for a symmetric coin this will occur in 20.5% of the series. And if there are 40,000 heads per 100,000 tosses, can the coin be considered symmetric? The decision-making procedure here is based on probability theory and mathematical statistics.
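Both figures in this paragraph can be checked directly. A minimal sketch with the standard library: the exact binomial probability of 4 heads in 10 tosses, and the normal-approximation z-statistic for 40,000 heads in 100,000 tosses:

```python
import math

def binom_pmf(k, n, p=0.5):
    """P(exactly k successes in n independent trials with success prob p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# A fair coin shows heads exactly 4 times in 10 tosses with probability
# 210/1024, i.e. about 20.5% -- the figure quoted in the text.
p4 = binom_pmf(4, 10)

# For 40,000 heads in 100,000 tosses the z-statistic is about -63,
# far beyond any usual critical value, so such a coin is not symmetric.
z = (40000 - 100000 * 0.5) / math.sqrt(100000 * 0.5 * 0.5)
```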

The example in question may not seem serious enough. However, it is. Drawing lots is widely used in organizing industrial technical-economic experiments, for example, when processing the results of measuring a quality indicator (the friction moment) of bearings as a function of various technological factors (the influence of a conservation environment, methods of preparing bearings before measurement, the effect of the bearing load during measurement, etc.). Suppose it is necessary to compare the quality of bearings depending on the results of their storage in different conservation oils, i.e. in oils of two different compositions. When planning such an experiment, the question arises which bearings should be placed in the oil of one composition and which in the other, in such a way as to avoid subjectivity and ensure the objectivity of the decision.

The answer to this question can be obtained by drawing lots. A similar example can be given with quality control of any product. To decide whether a controlled batch of products meets the established requirements or not, a sample is taken. Based on the results of sampling, a conclusion is made about the entire batch. In this case, it is very important to avoid subjectivity in the selection of the sample, i.e. it is necessary that each item in the controlled lot has the same probability of being selected in the sample. In production conditions, the selection of units of production in the sample is usually carried out not by lot, but by special tables of random numbers or with the help of computer random number sensors.

Similar problems of ensuring objectivity of comparison arise when comparing different schemes of organizing production or remuneration, during tenders and competitions, when selecting candidates for vacant positions, and so on. Everywhere a draw or a similar procedure is needed. Let us explain with the example of identifying the strongest and second-strongest teams in a tournament organized according to the Olympic system (the loser is eliminated). Suppose the stronger team always beats the weaker one. Clearly, the strongest team will definitely become champion. The second-strongest team will reach the final if and only if it plays no games with the future champion before the final; if such a game is scheduled, the second-strongest team will not make the final. Whoever plans the tournament can either "knock out" the second-strongest team ahead of schedule, pairing it with the leader in the first round, or secure second place for it by arranging meetings with weaker teams until the final. To avoid such subjectivity, lots are drawn. For an 8-team tournament, the probability that the two strongest teams meet in the final is 4/7; accordingly, with probability 3/7 the second-strongest team leaves the tournament early.
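The 4/7 figure can be verified by brute force: enumerate all assignments of the 8 teams to bracket slots (slots 0-3 form one half of the bracket, slots 4-7 the other) and count how often the two strongest teams land in different halves. This is a sketch under the text's assumption that the stronger team always wins:

```python
from fractions import Fraction
from itertools import permutations

# Teams are numbered by strength: 0 is the strongest, 1 the second strongest.
# With "stronger always wins", team 1 reaches the final iff it is drawn
# into the opposite half of the bracket from team 0.
favorable = total = 0
for bracket in permutations(range(8)):
    total += 1
    favorable += (bracket.index(0) < 4) != (bracket.index(1) < 4)

print(Fraction(favorable, total))  # 4/7
```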

Any measurement of product units (using a caliper, micrometer, ammeter, etc.) has errors. To find out whether there are systematic errors, it is necessary to make multiple measurements of a unit of production, the characteristics of which are known (for example, a standard sample). It should be remembered that in addition to the systematic, there is also a random error.

Therefore, the question arises how to find out from the measurement results whether there is a systematic error. If we note only whether the error obtained at each measurement is positive or negative, the problem can be reduced to the previous one. Indeed, let us compare a measurement with a coin toss: a positive error corresponds to heads, a negative one to tails (a zero error practically never occurs when the scale has a sufficient number of divisions). Then checking for the absence of a systematic error is equivalent to checking the symmetry of a coin.

The purpose of this reasoning is to reduce the problem of checking the absence of a systematic error to the problem of checking the symmetry of a coin. The above reasoning leads to the so-called "sign criterion" in mathematical statistics.
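The sign criterion can be sketched with the standard library alone: count the positive and negative error signs and compute the two-sided binomial tail probability under the hypothesis of no systematic error (the 15 measurement signs below are hypothetical):

```python
import math

def sign_test_p(plus, minus):
    """Two-sided sign test: probability, under H0 'positive and negative
    errors are equally likely', of a sign split at least this uneven."""
    n = plus + minus
    k = min(plus, minus)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Hypothetical: 12 positive and 3 negative errors in 15 measurements.
p_value = sign_test_p(12, 3)   # ~0.035: evidence of a systematic error
```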

With statistical regulation of technological processes, methods of mathematical statistics are used to develop rules and plans for statistical process control, aimed at timely detection of disruptions in technological processes, taking measures to adjust them and preventing the release of products that do not meet the established requirements. These measures reduce production costs and losses from the supply of substandard products. In statistical acceptance control, methods of mathematical statistics are used to develop quality control plans by analyzing samples from batches of products. The difficulty lies in correctly building the probabilistic-statistical decision-making models on the basis of which the questions posed above can be answered. In mathematical statistics, probabilistic models and methods for testing hypotheses have been developed for this purpose, in particular hypotheses that the proportion of defective units of production is equal to a certain number, for example 0.23 (remember the words of Strukov from A.N. Tolstoy's novel).

Estimation tasks. In a number of managerial, production, economic and national-economic situations, problems of a different type arise: problems of estimating the characteristics and parameters of probability distributions.

Let us look at an example. Suppose a batch of N light bulbs has been received for inspection, and a sample of n light bulbs has been randomly selected from it. A number of natural questions arise. How can the average service life of the bulbs be determined from the test results for the sample, and with what accuracy can this characteristic be estimated? How does the accuracy change if a larger sample is taken? At what number of hours can it be guaranteed that at least 90% of the light bulbs will last that long or longer?

Suppose that when testing a sample of n light bulbs, X of them turned out to be defective. Then the following questions arise: what limits can be given for the number of defective bulbs in the whole batch, for the defectiveness level, and so on?
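Approximate answers to such questions can be sketched with the normal approximation to the binomial distribution; the sample figures below (10 defective lamps out of 200 tested) are hypothetical:

```python
import math

def defect_share_bounds(defective, n, z=1.96):
    """Approximate 95% confidence bounds for the proportion of defective
    items, using p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = defective / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Hypothetical sample: 10 defective lamps among 200 tested.
low, high = defect_share_bounds(10, 200)
# The defectiveness level is estimated as 5%, roughly between 2% and 8%.
```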

Or suppose that in a statistical analysis of the accuracy and stability of technological processes it is necessary to evaluate such quality indicators as the mean value of the monitored parameter and the degree of its spread in the process under consideration. According to probability theory, it is advisable to use the mathematical expectation as the mean value of a random variable, and the variance, the standard deviation, or the coefficient of variation as a statistical characterization of its spread. This raises the question: how are these statistical characteristics to be estimated from sample data, and with what accuracy can this be done? There are many similar examples. Here it was important to show how probability theory and mathematical statistics can be used in production management when making decisions in the field of statistical management of product quality.
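The sample estimates just mentioned (mean, variance, standard deviation, coefficient of variation) can be sketched as follows; the five measurements are hypothetical:

```python
import math

def summary(sample):
    """Sample mean, unbiased variance, standard deviation, and
    coefficient of variation (as a percentage of the mean)."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    std = math.sqrt(variance)
    return mean, variance, std, 100 * std / mean

# Hypothetical measurements of a monitored parameter:
mean, variance, std, cv = summary([9.8, 10.1, 10.0, 9.9, 10.2])
```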

What is "mathematical statistics"? Mathematical statistics is understood as "a section of mathematics devoted to mathematical methods for collecting, systematizing, processing and interpreting statistical data, as well as using them for scientific or practical conclusions. The rules and procedures of mathematical statistics are based on the theory of probability, which makes it possible to assess the accuracy and reliability of conclusions obtained in each problem based on the available statistical material "[[2.2], p. 326]. In this case, statistical data is called information about the number of objects in some more or less extensive set that have certain characteristics.

According to the type of problems being solved, mathematical statistics is usually divided into three sections: data description, estimation and hypothesis testing.

By the type of processed statistical data, mathematical statistics is divided into four areas:

  • one-dimensional statistics (statistics of random variables), in which the observation result is described by a real number;
  • multivariate statistical analysis, where the result of observation over an object is described by several numbers (vector);
  • statistics of random processes and time series, where the observation result is a function;
  • statistics of objects of a non-numerical nature, in which the observation result is of a non-numerical nature, for example, it is a set (geometric figure), an ordering, or is obtained as a result of measurement by a qualitative attribute.

Historically, one-dimensional statistics and certain areas of statistics of objects of a non-numerical nature (in particular, problems of estimating the proportion of defective items and testing hypotheses about it) were the first to appear. Their mathematical apparatus is simpler, so they are usually used to demonstrate the basic ideas of mathematical statistics.

Only those data processing methods that are based on probabilistic models of the relevant real phenomena and processes, i.e. methods of mathematical statistics, are evidence-based. We are talking about models of consumer behavior, the occurrence of risks, the functioning of technological equipment, the obtaining of experimental results, the course of a disease, and so on. A probabilistic model of a real phenomenon should be considered constructed if the quantities under consideration and the relationships between them are expressed in terms of probability theory. The correspondence of the probabilistic model to reality, i.e. its adequacy, is substantiated in particular with the help of statistical methods for testing hypotheses.

Non-probabilistic data processing methods are exploratory: they can be used only for preliminary data analysis, since they do not make it possible to assess the accuracy and reliability of conclusions obtained from limited statistical material.

Probabilistic and statistical methods are applicable wherever it is possible to construct and substantiate a probabilistic model of a phenomenon or process. Their use is mandatory when conclusions drawn from a sample of data are transferred to the entire population (for example, from a sample to an entire batch of products).

In specific applications, both general-purpose and specific probabilistic-statistical methods are used. For example, in the section of production management devoted to statistical methods of product quality management, applied mathematical statistics (including the planning of experiments) is used. Its methods are used for the statistical analysis of the accuracy and stability of technological processes and for the statistical assessment of quality. Specific methods include statistical acceptance control of product quality, statistical regulation of technological processes, reliability assessment and control, and others.

Applied probabilistic-statistical disciplines such as reliability theory and queuing theory are widely used. The content of the first is clear from its name; the second studies systems such as a telephone exchange, which receives calls at random times - the requests of subscribers dialing numbers on their telephones. The duration of servicing these requests, i.e. the duration of conversations, is also modeled by random variables. A great contribution to the development of these disciplines was made by Corresponding Member of the USSR Academy of Sciences A.Ya. Khinchin (1894-1959), Academician of the Academy of Sciences of the Ukrainian SSR B.V. Gnedenko (1912-1995) and other domestic scientists.

A brief history of mathematical statistics. Mathematical statistics as a science begins with the works of the famous German mathematician Carl Friedrich Gauss (1777-1855), who, on the basis of probability theory, investigated and substantiated the least squares method, which he created in 1795 and applied to the processing of astronomical data (to refine the orbit of the minor planet Ceres). One of the most popular probability distributions, the normal distribution, is often named after him, and in the theory of random processes the main object of study is Gaussian processes.

At the end of the 19th and the beginning of the 20th century, a major contribution to mathematical statistics was made by English researchers, primarily K. Pearson (1857-1936) and R.A. Fisher (1890-1962). In particular, Pearson developed the chi-square test for statistical hypotheses, and Fisher developed analysis of variance, the theory of experimental design, and the maximum likelihood method of parameter estimation.

In the 1930s the Pole Jerzy Neyman (1894-1981) and the Englishman E. Pearson developed the general theory of testing statistical hypotheses, and the Soviet mathematicians Academician A.N. Kolmogorov (1903-1987) and Corresponding Member of the USSR Academy of Sciences N.V. Smirnov (1900-1966) laid the foundations of nonparametric statistics. In the 1940s the Romanian-born A. Wald (1902-1950) built the theory of sequential statistical analysis.

Mathematical statistics is developing rapidly at the present time. So, over the past 40 years, four fundamentally new areas of research can be distinguished [[2.16]]:

  • development and implementation of mathematical methods for planning experiments;
  • development of statistics of objects of non-numerical nature as an independent direction in applied mathematical statistics;
  • development of statistical methods that are stable in relation to small deviations from the used probabilistic model;
  • widespread development of work on the creation of computer software packages designed for statistical analysis of data.

Probabilistic-statistical methods and optimization. The idea of optimization permeates modern applied mathematical statistics and other statistical methods: methods of planning experiments, statistical acceptance control, statistical regulation of technological processes, and so on. On the other hand, optimization formulations in decision theory, for example in the applied theory of optimizing product quality and standard requirements, provide for the widespread use of probabilistic-statistical methods, primarily applied mathematical statistics.

In production management, in particular when optimizing product quality and standard requirements, it is especially important to apply statistical methods at the initial stage of the product life cycle, i.e. at the stage of the research preparation of experimental design developments (development of promising requirements for products, preliminary design, technical specifications for experimental design development). This is due to the limited information available at the initial stage of the product life cycle and the need to predict the technical capabilities and economic situation for the future. Statistical methods should be applied at all stages of solving an optimization problem: when scaling variables, developing mathematical models of the functioning of products and systems, conducting technical and economic experiments, etc.

All areas of statistics are used in optimization problems, including the optimization of product quality and the requirements of standards: statistics of random variables, multivariate statistical analysis, statistics of random processes and time series, and statistics of objects of non-numerical nature. The choice of a statistical method for the analysis of specific data is best made according to the recommendations in [...].

In accordance with the three main possibilities - decision-making under conditions of complete certainty, risk and uncertainty - methods and algorithms for decision-making can be divided into three main types: analytical, statistical and based on fuzzy formalization. In each case, the decision-making method is selected based on the task, the available initial data, the available problem models, the decision-making environment, the decision-making process, the required decision accuracy, and the analyst's personal preferences.
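The difference between the first two settings can be illustrated with a small sketch. The payoff table and scenario probabilities below are invented for illustration; under risk the alternative with the highest expected payoff is chosen, while under uncertainty (with no known probabilities) a criterion such as Wald's maximin can be applied.

```python
# Sketch: choosing an alternative under risk (expected payoff) and
# under uncertainty (Wald's maximin criterion). All data are hypothetical.

# Payoff of each alternative in each of three scenarios.
payoffs = {
    "A": [50, 20, -10],
    "B": [30, 25, 15],
}
scenario_probs = [0.5, 0.3, 0.2]  # known only in the risk setting


def expected_payoff(row, probs):
    """Expected payoff of one alternative over the scenarios."""
    return sum(p * x for p, x in zip(probs, row))


# Decision under risk: maximize the expected payoff.
best_risk = max(payoffs, key=lambda a: expected_payoff(payoffs[a], scenario_probs))

# Decision under uncertainty: Wald criterion, maximize the worst case.
best_wald = max(payoffs, key=lambda a: min(payoffs[a]))

print(best_risk, best_wald)  # → A B
```

Note that the two criteria can recommend different alternatives: the riskier option A wins on average, while the safer option B has the better worst case.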

In some information systems, the process of choosing an algorithm can be automated:

  • the automated system has at its disposal a library of algorithms of various types;
  • the system interactively asks the user a series of questions about the main characteristics of the problem under consideration;
  • based on the user's answers, the system offers the most suitable algorithm from the library (according to the criteria specified in it).

1 Probabilistic-statistical methods of decision-making

Probabilistic-statistical decision-making methods are used when the effectiveness of the decisions made depends on factors that are random variables for which the probability distribution laws and other statistical characteristics are known. Each decision can lead to one of many possible outcomes, and each outcome has a certain probability of occurrence that can be calculated. The indicators characterizing the problem situation are also described by probabilistic characteristics. With such methods, the decision maker always runs the risk of obtaining an unwanted result, since the optimal solution is chosen on the basis of the averaged statistical characteristics of the random factors; that is, the decision is made under conditions of risk.

In practice, probabilistic and statistical methods are often used when conclusions drawn from a sample of data are transferred to the entire population (for example, from a sample to an entire batch of products). However, in each specific situation, one should first assess the fundamental possibility of obtaining sufficiently reliable probabilistic and statistical data.

When using the ideas and results of the theory of probability and mathematical statistics when making decisions, the basis is a mathematical model, in which objective relations are expressed in terms of the theory of probability. Probabilities are used primarily to describe randomness that must be taken into account when making decisions. This refers to both unwanted opportunities (risks) and attractive ones ("lucky chance").

The essence of probabilistic-statistical decision-making methods is to use probabilistic models based on the estimation and testing of hypotheses using sample characteristics.

We emphasize that the logic of using sample characteristics for making decisions based on theoretical models involves the simultaneous use of two parallel series of concepts: those related to theory (the probabilistic model) and those related to practice (the sample of observation results). For example, the theoretical probability corresponds to the frequency found from the sample, and the mathematical expectation (theoretical series) corresponds to the sample arithmetic mean (practical series). Typically, sample characteristics are estimates of theoretical characteristics.
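The correspondence between the two series of concepts can be shown with a small simulation. This is a sketch assuming a fair six-sided die as the probabilistic model: the sample frequency estimates the theoretical probability 1/6, and the sample arithmetic mean estimates the mathematical expectation 3.5.

```python
# Sketch: sample characteristics (practical series) approximate
# theoretical characteristics (probabilistic model) for a fair die.
import random

random.seed(0)
sample = [random.randint(1, 6) for _ in range(100_000)]

freq_of_six = sample.count(6) / len(sample)  # estimates P(X = 6) = 1/6
sample_mean = sum(sample) / len(sample)      # estimates E[X] = 3.5

print(round(freq_of_six, 3), round(sample_mean, 3))
```

With 100,000 observations both estimates land close to their theoretical counterparts, illustrating why sample characteristics can stand in for the model's parameters.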

The advantages of these methods include the ability to take into account various scenarios for the development of events and their probabilities. Their disadvantage is that the probability values of the scenarios used in the calculations are usually very difficult to obtain in practice.

The application of a specific probabilistic-statistical decision-making method consists of three stages:

  1. the transition from economic, managerial, or technological reality to an abstract mathematical-statistical scheme, i.e. the construction of a probabilistic model of a control system, technological process, decision-making procedure (in particular, based on the results of statistical control), etc.;

  2. carrying out calculations and obtaining conclusions by purely mathematical means within the framework of the probabilistic model;

  3. interpretation of the mathematical-statistical conclusions in relation to the real situation and making an appropriate decision (for example, on the conformity or non-conformity of product quality with established requirements, or on the need to adjust the technological process), including specific conclusions (on the proportion of defective units in a batch, on the form of the distribution laws of the controlled parameters of the technological process, etc.).

A probabilistic model of a real phenomenon should be considered constructed if the quantities under consideration and the relationships between them are expressed in terms of probability theory. The adequacy of the probabilistic model is substantiated, in particular, with the help of statistical methods for testing hypotheses.

Mathematical statistics is usually divided, by the type of problems solved, into three sections: data description, estimation, and hypothesis testing. By the type of statistical data processed, it is divided into four areas:

  • one-dimensional statistics (statistics of random variables), in which an observation result is described by a real number;
  • multivariate statistical analysis, in which the result of observing an object is described by several numbers (a vector);
  • statistics of random processes and time series, in which an observation result is a function;
  • statistics of objects of non-numerical nature, in which an observation result is non-numerical, for example a set (geometric figure), an ordering, or the result of measurement by a qualitative attribute.

Consider an example where it is advisable to use probabilistic-statistical models.

When controlling the quality of any product, a sample is taken from it in order to decide whether the produced batch meets the established requirements. Based on the results of this sampling, a conclusion is drawn about the entire batch. It is therefore very important to avoid subjectivity in forming the sample: every unit of production in the controlled lot must have the same probability of being selected. Selection by lot in such a situation is not objective enough. Therefore, in production conditions, units of production are usually selected for the sample not by lot, but with the help of special tables of random numbers or computer random number generators.
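In code, such objective selection reduces to drawing the sample with a pseudo-random number generator, the modern analogue of the random number tables mentioned above. The batch contents below are hypothetical.

```python
# Sketch: unbiased selection of a sample from a controlled batch,
# so that every unit has the same probability of entering the sample.
import random

batch = [f"unit-{i}" for i in range(1000)]  # the controlled lot (hypothetical)
sample = random.sample(batch, k=20)         # each unit equally likely, no repeats

print(len(sample), len(set(sample)))  # → 20 20
```

`random.sample` draws without replacement, so no unit appears twice, and the inspector's preferences cannot influence which units are checked.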

With the statistical regulation of technological processes, rules and plans for statistical process control are developed on the basis of the methods of mathematical statistics, aimed at the timely detection of disruptions in technological processes, taking measures to adjust them, and preventing the release of products that do not meet the established requirements. These measures reduce production costs and losses from the supply of substandard products. In statistical acceptance control, quality control plans are developed, based on the methods of mathematical statistics, by analyzing samples from batches of products. The difficulty lies in correctly building the probabilistic-statistical decision-making models on the basis of which the questions posed above can be answered. In mathematical statistics, probabilistic models and methods for testing hypotheses have been developed for this purpose.
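As a sketch of the kind of probabilistic model used in statistical acceptance control, the operating characteristic of a single-sampling plan can be computed from the binomial distribution: inspect n units and accept the batch if at most c defectives are found. The plan parameters n and c below are illustrative, not taken from any standard.

```python
# Sketch: operating characteristic of a single-sampling acceptance plan.
# A batch with defect rate p is accepted if a sample of n units
# contains at most c defectives (binomial model, illustrative n and c).
from math import comb


def prob_accept(p, n=50, c=2):
    """P(at most c defectives in a sample of n) for defect rate p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


for p in (0.01, 0.05, 0.10):
    print(f"p = {p:.2f}: accept with probability {prob_accept(p):.3f}")
```

The acceptance probability falls as the true defect rate grows, which is exactly the trade-off an acceptance plan encodes: good batches should pass with high probability, bad batches with low probability.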

In addition, in a number of managerial, production, economic, and national economic situations, problems of a different type arise: problems of estimating the characteristics and parameters of probability distributions.

For example, in the statistical analysis of the accuracy and stability of technological processes, it is necessary to evaluate such quality indicators as the average value of the controlled parameter and the degree of its dispersion in the process under consideration. According to probability theory, it is advisable to use the mathematical expectation of a random variable as its mean value, and the variance, standard deviation, or coefficient of variation as a statistical characteristic of the spread. This raises the question: how can these statistical characteristics be estimated from sample data, and with what accuracy? There are many similar examples in the literature. They all show how probability theory and mathematical statistics can be used in production management when making decisions in the field of statistical product quality management.
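A minimal sketch of such point estimation from sample data (the measurements below are invented for illustration; the standard library's `statistics` module uses the unbiased, n − 1 estimators for variance and standard deviation):

```python
# Sketch: point estimates of the characteristics discussed above,
# computed from a (hypothetical) sample of a controlled parameter.
import statistics

measurements = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7]

mean = statistics.mean(measurements)     # estimates the mathematical expectation
var = statistics.variance(measurements)  # estimates the variance (n - 1 divisor)
std = statistics.stdev(measurements)     # estimates the standard deviation
cv = std / mean                          # sample coefficient of variation

print(f"mean={mean:.3f} var={var:.4f} std={std:.4f} cv={cv:.4f}")
```

The accuracy of such estimates is itself a statistical question; it is usually addressed with confidence intervals, whose width shrinks as the sample size grows.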

In specific areas of application, both widely used and specific probabilistic-statistical methods are employed. For example, in the section of production management devoted to statistical methods of product quality management, applied mathematical statistics (including the planning of experiments) is used. Its methods support the statistical analysis of the accuracy and stability of technological processes and the statistical assessment of quality. Specific methods include methods of statistical acceptance control of product quality, statistical regulation of technological processes, and the assessment and control of reliability, etc.

The most common probabilistic-statistical methods are regression analysis, factor analysis, analysis of variance, statistical risk assessment methods, the scenario method, etc. The area of statistical methods devoted to the analysis of statistical data of a non-numerical nature (measurement results for qualitative and heterogeneous characteristics) is becoming increasingly important. One of the main applications of statistics of objects of a non-numerical nature is the theory and practice of expert judgments, related to the theory of statistical decisions and voting problems.

The role of a person in solving problems using the methods of the theory of statistical decisions is to formulate the problem, i.e., to bring the real problem to the corresponding standard one, to determine the probabilities of events on the basis of statistical data, and also to approve the obtained optimal solution.

 

