What does the negative covariance mean

Covariance

Here we explain everything you need to know about covariance. Afterwards you will be able to calculate the covariance and will know the covariance formula and definition. You will also know the difference between covariance and correlation, so interpreting the result will be easy for you!

So if you have lost track of the various statistical coefficients and are asking yourself "What is covariance?", you are in the right place.

Covariance definition

Covariance is a non-standardized statistical measure of association that describes the linear relationship between two cardinally scaled variables. You will also encounter the notation Cov(X, Y), since the abbreviation derives from the English word covariance.

Covariance statistics

As a statistical measure, covariance is primarily used to check whether a linear, monotonic relationship exists between two random variables. Note that the variables must be at least cardinally scaled, because the formula uses the arithmetic means of the data, and that only a linear relationship can be detected. For example, you can use covariance to examine the relationship between the number of employees in a company and the quantity of goods produced (e.g. yoghurt).

Analysis of Covariance

When dealing with covariance, it is important not only to master the calculation well and quickly, but also to be able to classify, analyze and interpret the result. Only then can covariance exploit its full potential as a statistical measure of association and support concrete statements about the connection between two variables with computational evidence. In general, remember that a positive covariance indicates a positive linear relationship and a negative covariance indicates a negative linear relationship, while a result equal to or close to 0 means that no linear relationship can be assumed.

Calculate covariance

To explain the practical application of the covariance formula, we work with a simplified notation that writes the formula as a fraction and is easy to use in practice:

$$\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$$

Composition of the formula:

$\mathrm{Cov}$ stands for covariance and is derived from the English word covariance.

$x_i$ and $y_i$ stand for the individual values of the two random variables.

$\bar{x}$ and $\bar{y}$ stand for the mean values of the respective data sets of the x and y variables.

$N$ stands for the size of the sample and is adjusted by subtracting 1 in the denominator, since in many cases the sample does not correspond to the population. When dealing with sample-based data, one also speaks of the so-called empirical covariance. If, on the other hand, the sample is the same as the population, i.e. if a full survey took place, then only $N$ appears in the denominator.

Covariance calculation rules

As a practical calculation guide to do the covariance calculation quickly and easily, it is best to follow these 4 steps:

  1. First you determine the deviations for all values of your X and Y variables by taking the difference between each value and the associated mean value.
  2. In a second step, you create so-called deviation products by multiplying the deviations of the associated X and Y values.
  3. Then you add up all deviation products.
  4. At the end you divide, depending on whether you are working with the entire population or with a sample, either by the number of cases (N) or by the number of cases minus 1 (N - 1).
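
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation; the function name `covariance` and the `sample` flag are assumptions made for this example.

```python
def covariance(xs, ys, sample=True):
    """Covariance of paired values, following the four steps above.

    sample=True divides by N - 1 (empirical covariance),
    sample=False divides by N (full survey of the population).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two value pairs are needed")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 1: deviations of every value from its mean
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    # Step 2: deviation products of the associated X and Y values
    products = [dx * dy for dx, dy in zip(dev_x, dev_y)]

    # Step 3: sum of all deviation products
    total = sum(products)

    # Step 4: divide by N - 1 (sample) or N (population)
    return total / (n - 1 if sample else n)


print(covariance([1, 2, 3], [2, 4, 6]))                # 2.0
print(covariance([1, 2, 3], [2, 4, 6], sample=False))  # population version
```

The only difference between the two calls is the denominator in step 4, so the population version is always slightly smaller in magnitude than the sample version for the same data.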

Covariance example

To explain the calculation process in detail, a clear, practical example helps: the points table of a football league. This is a steady distribution pattern. The aim is to investigate whether there is a linear relationship between the number of goals scored (X) and the number of points (Y). Based on our previous knowledge of football, we expect a positive linear relationship.

Sample size N = 6 (the population is larger, since the fictitious football league is modelled on the Bundesliga and therefore comprises 18 teams; hence the denominator N - 1)

Before starting the calculation, the table is expanded so that the mean values, the differences, the respective result from the product and the final value can be entered.

Calculate covariance example

The calculation is then very simple: First you determine the mean values for goals and points on the basis of the table by dividing the sum of all values by the number of values considered. For $\bar{x}$ the result is (6 + 5 + 4 + 1 + 3 + 2) / 6 = 3.5. For $\bar{y}$ the value is (6 + 6 + 2 + 1 + 0 + 0) / 6 = 2.5. The deviations can now be determined using the mean values. For example, for club 1 the difference for X is 6 - 3.5 = 2.5 and the difference for Y is 6 - 2.5 = 3.5.

Take care not to mix up the values. The two deviations are multiplied in the next step, so you calculate 2.5 · 3.5 = 8.75. The first row is now filled out. The calculation for the other clubs works the same way. At the end you only have to add up all the values in the last column to get the result of 22.5.

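| Club | X = goals | Y = points | Diff. X | Diff. Y | Product |
| --- | --- | --- | --- | --- | --- |
| Club 1 | 6 | 6 | 2.5 | 3.5 | 8.75 |
| Club 2 | 5 | 6 | 1.5 | 3.5 | 5.25 |
| Club 3 | 4 | 2 | 0.5 | -0.5 | -0.25 |
| Club 4 | 1 | 1 | -2.5 | -1.5 | 3.75 |
| Club 5 | 3 | 0 | -0.5 | -2.5 | 1.25 |
| Club 6 | 2 | 0 | -1.5 | -2.5 | 3.75 |

N = 6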

The determined values are then transferred from the table to the formula. You put the sum of the deviation products, 22.5, in the numerator and use the sample size 6 for N in the denominator, which gives N - 1 = 5. The overall result is therefore 22.5 / 5 = 4.5.

The result clearly shows that the initial idea expressed at the beginning was correct and that there is a positive linear relationship. You can find out exactly what this means and how you should interpret this result in a later section.
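
If you want to check the table by machine, the following short Python sketch recomputes it from the goals and points columns. The variable names are chosen for this illustration only; the data are the six clubs from the table.

```python
goals = [6, 5, 4, 1, 3, 2]    # X, one entry per club
points = [6, 6, 2, 1, 0, 0]   # Y, one entry per club

n = len(goals)
mean_goals = sum(goals) / n
mean_points = sum(points) / n

# Deviation product per club (last column of the table)
products = [(g - mean_goals) * (p - mean_points)
            for g, p in zip(goals, points)]

sum_products = sum(products)
cov = sum_products / (n - 1)  # sample, so divide by N - 1

print(mean_goals, mean_points)  # 3.5 2.5
print(products)                 # [8.75, 5.25, -0.25, 3.75, 1.25, 3.75]
print(sum_products, cov)        # 22.5 4.5
```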

Covariance formula

You have already got to know the regular formula for calculation in more detail.

In addition to the covariance formula from the calculation example, you may encounter alternative notations. The classic formula defines the covariance as the expected value of the product of the deviations of the two random variables X and Y from their respective expected values E(X) and E(Y). It is often used for discrete distributions. As a formula, this relationship looks like this:

$$\mathrm{Cov}(X, Y) = E\big[(X - E(X))\,(Y - E(Y))\big]$$

This formula is used in particular when the calculation has to distinguish between continuous and discrete distributions. Especially for discrete distributions, which include urn draws or dice experiments, for example, it is important to use expected values in the calculation so that the probabilities (p) are taken into account.
This formula can be simplified considerably by a basic transformation, the so-called shift theorem, which helps to illustrate the relationships that covariance calculations can reveal.

Shift theorem:

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$$

Multiplying out the product in the expected-value formula leads step by step to this simplified formula, which above all illustrates an essential characteristic of covariance: for two random variables that are independent of each other, E(XY) = E(X) E(Y), so their covariance is 0 and they do not covary.
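
A dice experiment makes this concrete. The Python sketch below computes the covariance of a discrete distribution twice, once as the expected value of the deviation products and once as E(XY) - E(X) E(Y), using exact fractions. The helper names `expectation`, `cov_both_ways`, `first_die`, `second_die` and `total` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product


def expectation(outcomes, f):
    """E[f(outcome)] for a list of (outcome, probability) pairs."""
    return sum(p * f(o) for o, p in outcomes)


def cov_both_ways(outcomes, fx, fy):
    """Return (definition form, shifted form) of Cov(X, Y)."""
    ex = expectation(outcomes, fx)
    ey = expectation(outcomes, fy)
    by_definition = expectation(outcomes, lambda o: (fx(o) - ex) * (fy(o) - ey))
    by_shift = expectation(outcomes, lambda o: fx(o) * fy(o)) - ex * ey
    return by_definition, by_shift


# Joint distribution of two fair dice: 36 outcomes, each with p = 1/36
outcomes = [((a, b), Fraction(1, 36))
            for a, b in product(range(1, 7), repeat=2)]


def first_die(o):
    return o[0]


def second_die(o):
    return o[1]


def total(o):
    return o[0] + o[1]


# Independent variables: both forms give 0
print(cov_both_ways(outcomes, first_die, second_die))

# Dependent variables (first die and sum of both dice): both forms give 35/12
print(cov_both_ways(outcomes, first_die, total))
```

Both forms agree in each case, and the two independent dice have covariance 0, while the first die and the sum of both dice covary positively.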

Covariance interpretation

Now that the practical calculation and the theoretical formula have been discussed in detail, the question naturally arises as to what the core message of the result obtained is and how it is to be interpreted in relation to the available data sets.

Positive covariance

If, at the end of your calculations, the covariance is positive, it can be deduced that high values of the X variable under consideration are associated with high values of the Y variable. At the same time, a positive value also means that low values of the X variable are associated with low values of the Y variable.

In any case, the decisive factor is that in the case of a positive covariance, the two variables under consideration move in the same direction.

Negative covariance

The possible results of your calculations lie anywhere in the real numbers and can therefore also be negative. A negative covariance implies that the variables move in opposite directions: when the X variable takes high (low) values, the Y variable tends to take low (high) values.

Uncorrelatedness

Besides positive and negative covariance, the result can also be 0. In this case the two characteristics represented by the variables are uncorrelated and therefore not in a linear relationship. However, uncorrelatedness does not automatically mean stochastic independence: only a linear relationship has been ruled out, and the relationship may still be non-linear, for example quadratic.
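
A small, deliberately constructed data set shows the difference between uncorrelated and independent. In the Python sketch below, Y is completely determined by X (Y = X squared), yet the covariance is zero because the X values are symmetric around 0. The data are made up for this illustration.

```python
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # Y depends entirely on X: 4, 1, 0, 1, 4

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Positive and negative deviation products cancel each other out
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

print(cov == 0)  # True: uncorrelated, although Y is a function of X
```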

Covariance and Correlation

By calculating and interpreting the covariance, you can make important statements about the direction and linearity of the relationship between two variables. The main weakness of this statistical variable is that no reliable assessment of the strength of the relationship can be made on the basis of the covariance.

Standardized covariance

This is where correlation comes into play: standardizing the covariance, i.e. dividing it by the product of the standard deviations of the two variables, yields the so-called correlation coefficient, which always lies between -1 and 1 and can thus also express the strength of the linear relationship. For this reason, the covariance is often only used as part of or as the basis for further correlation calculations.

Covariance correlation difference

So remember: correlation is standardized covariance. If only the direction of the relationship is required, it is completely sufficient to calculate the covariance. However, if you also want to make a statement about the strength of the linear relationship, you need the correlation coefficient for this.
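
To see why the covariance alone says little about strength, the following Python sketch reuses the goals and points from the football example and rescales the points by a factor of 10. The factor and the helper names `covariance` and `correlation` are assumptions made for this illustration.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance (denominator N - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    # covariance(xs, xs) is the sample variance of xs
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


goals = [6, 5, 4, 1, 3, 2]
points = [6, 6, 2, 1, 0, 0]
points_x10 = [10 * p for p in points]  # same data on a different scale

print(covariance(goals, points), covariance(goals, points_x10))  # 4.5 45.0
print(round(correlation(goals, points), 3),
      round(correlation(goals, points_x10), 3))                  # 0.856 0.856
```

The covariance grows tenfold with the change of scale, while the correlation coefficient stays the same, which is why only the latter is suitable for judging the strength of the relationship.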

Covariance matrix

The covariance itself is always calculated in pairs, i.e. for two random variables. However, if you want to look at the relationships between more than two variables, you can arrange the pairwise results in a so-called covariance matrix (also called a variance-covariance matrix).

Below you can see the general layout of a covariance matrix for three random variables X, Y and Z:

$$\Sigma = \begin{pmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, Y) & \mathrm{Cov}(X, Z) \\ \mathrm{Cov}(Y, X) & \mathrm{Var}(Y) & \mathrm{Cov}(Y, Z) \\ \mathrm{Cov}(Z, X) & \mathrm{Cov}(Z, Y) & \mathrm{Var}(Z) \end{pmatrix}$$

The layout of the covariance matrix also helps to understand an essential characteristic of covariance: its symmetry. Since the deviation products in the numerator of the formula are multiplied and then added up, the order of the variables is irrelevant, so Cov(X, Y) = Cov(Y, X). The covariances are therefore symmetrical, which shows in the way the values are mirrored across the diagonal of variances (from top left to bottom right).
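
The following Python sketch builds such a matrix for three variables and checks the symmetry. The goals and points come from the football example; the third column, `conceded`, is purely hypothetical data invented for this illustration, and the helper names are assumptions as well.

```python
def covariance(xs, ys):
    """Sample covariance (denominator N - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def covariance_matrix(columns):
    """Pairwise covariances of all columns; the diagonal holds the variances."""
    return [[covariance(a, b) for b in columns] for a in columns]


goals = [6, 5, 4, 1, 3, 2]
points = [6, 6, 2, 1, 0, 0]
conceded = [1, 2, 3, 6, 4, 5]  # hypothetical third variable, not from the article

matrix = covariance_matrix([goals, points, conceded])
for row in matrix:
    print(row)

# Symmetry: the entry in row i, column j equals the entry in row j, column i
size = len(matrix)
print(all(matrix[i][j] == matrix[j][i]
          for i in range(size) for j in range(size)))  # True
```

The diagonal entries are the covariances of each variable with itself, i.e. the variances, and every off-diagonal value appears twice, mirrored across the diagonal.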