Psychology Textbook Unit 7 Correlational Measures


Unit 7. Correlational Measures

Castro and Toby

Summary. In this unit, we will start analyzing how two different variables may relate to one another. We will concentrate on Pearson's correlation coefficient: how it is calculated, how to interpret it, and different issues to consider when using it to measure the relationship between two variables.

Prerequisite Units

Introduction to Statistics for Psychological Science
Managing Data
Descriptive Statistics for Psychological Research
[...] to Statistical Significance

Measures of Association

In prior units, we have focused on statistics related to one specific variable: the mean of a variable, the standard deviation of a variable, how the values of a variable are distributed, etc. But often we want to know not only about one variable, but also how one variable may be related to another variable. So, in this unit we will focus on analyzing the possible association between two variables.

The choice of the appropriate measure of association depends on the types of variables that we are analyzing; that is, it depends on whether the variables of interest are quantitative, ordinal, or categorical, and on the possible range of values of these variables. For example, we may be interested in the differences in healthy lifestyle scores (a continuous quantitative variable) between people who graduated from college and people who did not graduate from college (a categorical variable). Note that this categorical variable has only two values; because of that, it is called a dichotomous variable. In this situation, when one variable consists of numerical, continuous scores, and the other variable has only two values, we use the point-biserial correlation to measure the relationship between the variables.

In other cases, we are interested in how two categorical variables are related. For example, we may want to examine whether ethnic origin is associated with marital status (single, partnership, married, divorced). In this situation, when the two variables are categorical, you use a chi-square as the measure of association.

We could also be interested in the relationship between ordinal variables. If, for example, we want to see if there is a relationship between student rankings in a language test and in a music test, the two variables are ordinal or ranked variables. In this case, when two variables are measured on an ordinal scale, we should use Spearman's correlation to measure the strength and direction of the association between the variables.

You need to know that these possibilities exist. However, in this unit we are going to focus on the analysis of the relationship between two variables when both variables are quantitative. In this case, we typically use Pearson's correlation coefficient; for example, when we want to see if there is a relationship between time spent on social media and scores on an anxiety scale. But before looking at this statistical analysis in detail, let's clarify a few issues.
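The choices described above can be summarized as a small lookup. This is only a sketch: the function name `suggest_measure` and the four type labels are hypothetical, and the measure names "point-biserial correlation" and "chi-square" are the standard names assumed here for the dichotomous and categorical cases.

```python
# Hypothetical helper (not from the unit): maps the types of two variables
# to the measure of association that this section associates with them.
MEASURES = {
    ("quantitative", "quantitative"): "Pearson correlation",
    ("dichotomous", "quantitative"): "point-biserial correlation",
    ("ordinal", "ordinal"): "Spearman correlation",
    ("categorical", "categorical"): "chi-square",
}


def suggest_measure(type_x, type_y):
    """Return the measure for a pair of variable types, or None if the
    combination is not one of the four cases discussed in this section."""
    key = tuple(sorted((type_x, type_y)))  # order of the two variables is irrelevant
    return MEASURES.get(key)


print(suggest_measure("quantitative", "dichotomous"))
print(suggest_measure("ordinal", "ordinal"))
```

Sorting the pair of labels keeps the table short, because the measure does not depend on which variable is named first.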

Correlation as a Statistical Technique

First of all, we need to distinguish between correlational research and correlational statistical analyses. In a correlational research study, we measure two or more variables as they occur naturally to determine if there is a relationship between them. We may read that as people make more money, they spend less time socializing with others; or that children with social and emotional difficulties in homes are more likely to be given mobile technology to calm them down; or that spatial reasoning predicts later math skills. The studies supporting these conclusions are correlational because the researcher is measuring the variables (making money, time socializing, social and emotional difficulties, and spatial reasoning skills) rather than manipulating them; that is, the researcher is measuring these variables in the real world, rather than determining possible values of these variables and assigning groups to specific values.

When we say that a study is correlational in nature, we are referring to the study design, not the statistics used for analysis. In some situations, the results from a correlational study are analyzed using something other than a correlational statistic. Conversely, the results from some experiments, in which one of the variables was determined by the experimenter (such as the number of dots on a computer screen), can be analyzed using a correlational statistic. Thus, it is important to understand when it is appropriate to use a correlational statistical technique.

When we analyze a correlation, we normally are looking at the relationship between two numerical variables, so that we have two scores for each participant. If we want to see whether spatial skills at the age of 4 correlate with mathematical skills at the age of 12, then we will have one score for spatial skills at 4, and one score for mathematical skills at 12, for each individual participating in our study.

These scores can be represented graphically in a scatterplot. In a scatterplot, each participant in the study is represented by a point in two-dimensional space. The coordinates of this point are the participant's scores on variables X (on the horizontal axis) and Y (on the vertical axis). How we decide which variable goes on the abscissa (x-axis) or on the ordinate (y-axis) depends on how we interpret the underlying relationship between our variables. We may have a predictor variable that is related to an outcome (also called response or criterion variable), as in the case of spatial skills at the age of 4 (predictor) and mathematical skills at the age of 12 (outcome). In this case, the predictor variable is represented on the x-axis, whereas the outcome variable is represented on the y-axis, as shown in Figure 7.1.

This distinction between predictor and outcome variables may be obvious in some cases; for example, smoking will be the predictor and whether or not lung cancer develops will be the outcome, or healthy diet will be the predictor and cardiovascular disease the outcome. But it may not always be; for example, time to complete a task may be related to accuracy in that task, but it is not clear whether accuracy depends on time or time depends on accuracy. In this latter case, it does not matter which variable is represented on the x-axis or the y-axis. However, when creating a scatterplot, be aware that most people will have a tendency to interpret the variable on the x-axis as the one leading to the variable on the y-axis.

A scatterplot provides you with a quick, visual, intuitive way to assess the correlation between the two variables. It could be that there is no correlation, or the correlation is positive, or negative, as illustrated in Figure 7.1.

Figure 7.1. Three scatterplots depicting different types of relationships. On the top left, a scatterplot showing a positive relationship between spatial skills at the age of 4 and mathematical skills at the age of 12. On the top right, a scatterplot showing a negative relationship between social isolation and cognitive functioning in the elderly. On the bottom, a scatterplot showing no relationship between height and intelligence. The straight lines show the trend of the linear relationship between the variables.

If spatial skills in 4-year-olds are related to mathematical skills at the age of 12, so that children who at the age of 4 show poor spatial skills tend to show poor mathematical skills eight years later, and children who at the age of 4 show good spatial skills tend to show good mathematical skills eight years later, then there is a positive correlation between spatial skills at the age of 4 and mathematical skills at the age of 12. This positive correlation, where the two variables tend to change in the same direction, is represented on the top left panel.

On the top right panel we see an example of a negative correlation. In this case, it was observed that, in the elderly, the higher the social isolation, the lower their cognitive functioning. The correlation is negative because the two variables go in opposite directions: as social isolation increases, cognitive functioning decreases.

On the bottom panel, height and intelligence are depicted. In this case, the data points are distributed without showing any trend, so there is no correlation between the variables.

Pearson Correlation Coefficient

Visual inspection of a scatterplot can give us a very good idea of how two variables are related. But we need to conduct a statistical analysis to confirm a visual impression or, sometimes, to uncover a relationship that may not be very obvious. There are different correlational analyses, depending on the type of relationship between the variables that we want to analyze. When we have two quantitative variables that change together, so that when one of the variables increases, the other variable increases as well, or when one of the variables increases, the other variable decreases, the most commonly used correlational analysis is the Pearson product-moment correlation, typically named Pearson's correlation coefficient, Pearson's r, or just r.

In order to use Pearson's correlation coefficient:

1. Both variables must be quantitative. If the variables are numerical, but measured along an ordinal scale, Spearman's coefficient should be used instead.

2. The relationship between the two variables should be linear. Pearson's correlation coefficient can only detect and quantify linear relationships. If the data in the scatterplot show some kind of curvilinear trend, then the relationship is not linear and a more complicated procedure should be used instead.

Pearson's correlation coefficient measures the direction and the degree of the linear relationship between two variables. The value of r ranges from +1 (perfect positive correlation) to -1 (perfect negative correlation); an r of 0 means that there is no correlation. The sign of r tells you the direction (positive or negative) of the relationship between variables. The magnitude of r tells you the degree of relationship between variables.

Let's say that we obtain a correlation coefficient of 0.83 between physical activity (exercise hours per week) and scores on an academic test. What does it mean?

Since 0.83 is positive and close to 1.00, you can say that the two variables have a strong positive relationship: a high number of exercise hours per week is related to high scores on the academic test. In a different situation, let's say that we obtain a negative correlation coefficient, close to -1.00, between number of alcoholic drinks per week and scores on an academic test. What does it mean?

Since this coefficient is negative and close to -1.00, you can say that the two variables have a strong negative relationship: a high number of alcoholic drinks per week is related to low scores on the academic test.

A slightly more complicated way to quantify the strength of the linear relationship is by using the square of the correlation, r². The reason why this is sometimes preferred is that r² is the proportion of the variability in the outcome variable that can be "explained" by the value of the predictor variable. If we obtain an r² of 0.23 when analyzing the relationship between spatial skills at the age of 4 (predictor) and mathematical skills at the age of 12 (outcome), we will say that spatial skills at the age of 4 explain 23% of the variability in mathematical skills at the age of 12. This amount of explained variance is one way of expressing the degree to which some relation or phenomenon is present. Importantly, when you use r² as your measure of strength, you can make statements like "verbal working memory score is twice as good at predicting IQ as spatial working memory score" (assuming that the r² for verbal and IQ is twice as large as the r² for spatial and IQ). You cannot make statements of this sort based on the (unsquared) values of r.

Conceptually, Pearson's correlation coefficient computes the degree to which change in one numerical variable is associated with change in another numerical variable. It can be described in terms of the covariance of the variables, a measure of how two variables vary together. When there is a perfect linear relationship, every change in the X variable is accompanied by a corresponding change in the Y variable. The result is a perfect linear relationship, with X and Y always varying together. In this case, the covariability (of X and Y together) is identical to the variability of X and Y separately, and r will be positive (if the two variables increase together) or negative (if increases in one variable correspond to decreases in the other variable).

To understand the calculations for r, we need to understand the concept of the sum of products of deviations (SP). It is very similar to the concept of the sum of squared deviations (SS) that we saw in a prior unit to calculate the variance and standard deviation. This was the formula for the SS of one single variable, X:

$$SS = \sum_{i=1}^{n} (X_i - \bar{X})^2$$

In order to see the similarities with the SP, it will be even clearer if we write the formula for the SS this way:

$$SS = \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})$$

The formula for the sum of the products of the deviation scores, or SP, computes the deviations of each score for X and for Y from its corresponding mean, and then multiplies and adds those values:

$$SP = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

In order to show the degree to which X and Y vary together, that is, their covariance (similar to the variance for one variable, but now referring to two variables), we divide by n - 1:

$$cov_{XY} = \frac{SP}{n - 1}$$

Practice

Let's calculate the SP and the covariance with the dataset containing the number of hours studying (Hours, X) and the grade obtained (Grade, Y) for 15 students:

Hours (X)    Grade (Y)
[...]        78
11           80
16           89
14           85
12           84
15           86
18           95
20           96
10           83
[...]        81
16           93
17           92
13           84
12           83
14           88

For each student, we compute the deviation of Hours from its mean, the deviation of Grade from its mean, and the product of these two deviations. Now, as the summation sign in the SP formula indicates, we need to add up the products of each pair of deviation scores, one product per student. This total is the SP. Then, we divide the SP by n - 1, that is, by 14. Thus, the covariance between Hours and Grade is 17.95.

It could be possible to use the covariance between X and Y as a measure of the relationship between the two variables; however, its value is not quickly understandable or possible to compare across studies because it depends on the measurement scale of the variables, that is, it depends on the specific units of X and the specific units of Y. Pearson's correlation coefficient solves this issue by dividing the covariance by the standard deviations of X and Y, so that the units (and scale effects) cancel out:

$$r = \frac{cov_{XY}}{s_X s_Y}$$
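The SP and covariance steps above can be sketched in a few lines of Python. The grades are the 15 values listed in the practice table. Two Hours entries (for the students with grades 78 and 81) are not legible in this copy of the table; the values 8 and 9 used below are an inference, not a quotation: they are a pair of whole numbers that reproduces the covariance of 17.95 reported in the text. The function names are illustrative.

```python
# Hours studied (X) and grade obtained (Y) for the 15 students.
# ASSUMPTION: the first and tenth Hours values (8 and 9) are inferred,
# chosen because they reproduce the covariance reported in the text.
hours = [8, 11, 16, 14, 12, 15, 18, 20, 10, 9, 16, 17, 13, 12, 14]
grades = [78, 80, 89, 85, 84, 86, 95, 96, 83, 81, 93, 92, 84, 83, 88]


def mean(values):
    return sum(values) / len(values)


def sum_of_products(xs, ys):
    """SP: for each participant, multiply the deviation of X from its mean
    by the deviation of Y from its mean, then add up all the products."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def covariance(xs, ys):
    """Sample covariance: SP divided by n - 1."""
    return sum_of_products(xs, ys) / (len(xs) - 1)


sp = sum_of_products(hours, grades)
cov = covariance(hours, grades)
print(f"SP = {sp:.2f}")
print(f"covariance = {cov:.2f}")
```

With the data as entered, the SP is about 251.33 and the covariance about 17.95. Multiplying every grade by 10 would multiply the covariance by 10 as well, which is the scale dependence that motivates the move to r.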

With this maneuver, the limits of r range between -1 and +1 and, therefore, r is easy to interpret, and its value can be used to compare different studies.

Practice

Let's calculate Pearson's correlation coefficient in our case. We need the standard deviation of X, the number of hours, and the standard deviation of Y, the grade. Dividing the covariance, 17.95, by the product of these two standard deviations gives Pearson's correlation coefficient between number of hours studying and grade obtained in our dataset, which turns out to be very high.

As indicated above, the magnitude of r tells us how weak or strong the relationship is; so, the closer to 0, the weaker the relationship is, whereas the closer to 1 (or -1), the stronger the relationship is. Figure 7.2 shows different scatterplots illustrating different values of r.
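As a sketch of this calculation, the code below divides the covariance by the product of the two standard deviations and then squares the result to obtain r². The data are the practice dataset, with the same caveat as before: two Hours values (8 and 9, for the students with grades 78 and 81) are inferred rather than quoted, so the printed values are a reconstruction and not the unit's own figures.

```python
import math

# ASSUMPTION: the first and tenth Hours values (8 and 9) are inferred.
hours = [8, 11, 16, 14, 12, 15, 18, 20, 10, 9, 16, 17, 13, 12, 14]
grades = [78, 80, 89, 85, 84, 86, 95, 96, 83, 81, 93, 92, 84, 83, 88]


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


r = pearson_r(hours, grades)
r_squared = r ** 2  # proportion of variability in Y "explained" by X

print(f"r = {r:.2f}")
print(f"r squared = {r_squared:.2f}")
```

With these inputs r comes out at about 0.95 and r² at about 0.90, which agrees with the text's description of the correlation as very high. Because both the numerator and the denominator carry the units of X times the units of Y, the result has no units.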

Figure 7.2. Three scatterplots depicting positive relationships of different strength between variables X and Y. The left, center, and right panels each show a different value of r.

A perfect correlation of +1 or -1 means that all the data points lie exactly on a straight line. These perfect relationships are rare, in general, and very unlikely in psychological research.

Interpretation of Pearson Correlation Coefficient

The interpretation of the value of the correlation coefficient is somewhat arbitrary (see Table 7.1 for typical guidelines). Although most data scientists will agree that a very small r reflects a negligible relationship, and a very large r reflects a very strong relationship, how to interpret intermediate coefficients is more uncertain. An intermediate r value may be weak or strong depending on the typical or possible association found between some given variables. For example, height and weight are typically highly correlated, so a given intermediate r between those variables would be low; however, the same r value between eating cranberries daily and cognitive capacity would be high, given that so many other variables are related to cognitive capacity. It also may be weak or strong depending on the results of other research studies in the same area. Thus, a specific r value should be interpreted within the context of the specific research.

Table 7.1. Typical guidelines for the interpretation of r. The table pairs ranges of r values with strength labels, in this order: negligible, weak, moderate, strong, very strong.

Issues to Consider

Outliers

An outlier is a data point that has a value much larger or smaller than the other values in a data set. It is important to carefully examine the scatterplot for outliers because they can have an excessive influence on Pearson's correlation coefficient. For example, in Figure 7.3, we can see a data point that has X and Y values much larger than the other data points. For this entire data set, Pearson's r indicates a strong relationship between X and Y. However, if we do not include the outlier in the analysis, Pearson's r is greatly reduced, very close to 0, indicating a negligible relationship. You should be able to easily visualize the difference with and without the outlier data point in the scatterplot in Figure 7.3.

Figure 7.3. Scatterplot of the relationship between two variables, X and Y, with one outlier in the data set.

You always need to examine the distribution of your scores to make sure that there are no relevant outliers. Whether outliers should or should not be eliminated from an analysis depends on the nature of the outlier. If it is a mistake in the data collection, it should probably be eliminated; if it is an unusual, but still possible value, it may need to be retained. One way or the other, the presence of the outlier should be noted in your data report. If the outlier must be retained, an alternative analysis can be Spearman's correlation, which is more robust than Pearson's correlation coefficient against outliers.
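The effect of a single outlier can be illustrated with a small made-up data set (the numbers below are hypothetical and are not the data behind Figure 7.3). Five points with no linear relationship are joined by one point that is far larger on both variables. Spearman's correlation is computed here as Pearson's r applied to ranks, using a simple ranking helper that assumes there are no tied values.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as SP / sqrt(SS_x * SS_y); the (n - 1) terms cancel,
    so this equals the covariance divided by the product of the SDs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def ranks(values):
    """Rank each value from 1 (smallest) upward. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical scores: five ordinary participants, then one outlier.
x = [1, 2, 3, 4, 5]
y = [4, 1, 3, 5, 2]
x_with_outlier = x + [15]
y_with_outlier = y + [15]

r_without = pearson_r(x, y)
r_with = pearson_r(x_with_outlier, y_with_outlier)
rho_with = spearman_rho(x_with_outlier, y_with_outlier)

print(f"Pearson r without the outlier:  {r_without:.2f}")
print(f"Pearson r with the outlier:     {r_with:.2f}")
print(f"Spearman rho with the outlier:  {rho_with:.2f}")
```

In this example Pearson's r moves from 0.00 to about 0.92 when the outlier is added, while Spearman's correlation on the same six points is about 0.43, because ranking limits how far the extreme point can pull the result.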

Restricted Range

You need to be aware that the value of a correlation can be influenced by the range of scores in the dataset used. If a correlation coefficient is calculated from a set of scores that do not represent the full range of possible values, you need to be cautious in interpreting the correlation coefficient. For example, you may be interested in the relationship between family income level and educational achievement. You choose a convenience sample in a nearby school, which happens to be a private school in which most students come from wealthy and very wealthy families. You analyze the data in your sample and find that there is no correlation between family income level and educational achievement. It could be that, indeed, this relationship is not apparent among students from wealthy families, but it would have been revealed if you had included in your sample students from low-income and other less wealthy families. Figure 7.4 shows a scatterplot that depicts this issue.

Figure 7.4. Considering all the possible values for X and Y in this scatterplot, the two variables are correlated. However, if the data set consisted of a restricted range of values, such as those high values included in the circle, r would be close to 0.
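The restricted-range problem can be shown with a small hypothetical data set (the income levels and achievement scores below are invented for illustration and are not the data in the figure). The correlation is computed once over the full range of income and once using only the four highest income levels.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical data: income level (1 = lowest, 10 = highest) and achievement.
income = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
achievement = [2, 1, 4, 3, 6, 5, 9, 8, 8, 9]

r_full = pearson_r(income, achievement)

# Restricted sample: only families with income level 7 or higher,
# as if all the data came from one school serving wealthy families.
wealthy = [(x, y) for x, y in zip(income, achievement) if x >= 7]
r_restricted = pearson_r([x for x, _ in wealthy], [y for _, y in wealthy])

print(f"r over the full range:     {r_full:.2f}")
print(f"r over the restricted set: {r_restricted:.2f}")
```

Here r is about 0.93 over the full range but 0.00 within the top four income levels, even though both values come from the same underlying data.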

In general, in order to establish whether or not a correlation exists between two variables, you need a wide range of values for each of the variables. If it is not possible to obtain this wide range of values, then at least you should limit your interpretation to the specific range of values in your dataset.

Correlation Coefficient in the Sample and in the Population

We rarely are interested in the correlation between variables that only exists in a sample. As is true for most situations, we use the correlation coefficient in our sample to make an estimate of the correlation in the population. How does that work?

The sample data allow us to compute r, the correlation coefficient for the sample. We normally do not have access to the entire population, so we cannot know or calculate the correlation coefficient for the population (named rho, ρ). Because of this, we use the sample correlation coefficient, r, as our estimate of the unknown population correlation coefficient. The accuracy of this estimation depends on two things. First, the sample needs to be representative of the population; for example, the people included in the sample need to be an unbiased, random subset of the population. This is not a statistical issue, but it should always be kept in mind. Second, how precisely a sample correlation coefficient will match the population correlation coefficient depends on the size of the sample: the larger the sample, the more accurate the estimate (provided that the sample is representative).
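The second point can be illustrated with a small simulation. This is a sketch under stated assumptions, none of which come from the unit: the population correlation is set to 0.5, both variables are normally distributed, the sample sizes are 10 and 200, and 500 samples are drawn for each size. The code measures how much the sample r varies from one sample to the next.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r as SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def sample_r(n, rho, rng):
    """Draw n (x, y) pairs from a population with correlation rho
    and return the correlation observed in that sample."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Mixing x with independent noise gives y a correlation of rho with x.
        y = rho * x + math.sqrt(1 - rho ** 2) * noise
        xs.append(x)
        ys.append(y)
    return pearson_r(xs, ys)


def spread_of_r(n, rho=0.5, repetitions=500, seed=1):
    """Standard deviation of the sample r across many samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rs = [sample_r(n, rho, rng) for _ in range(repetitions)]
    m = sum(rs) / len(rs)
    return math.sqrt(sum((r - m) ** 2 for r in rs) / (len(rs) - 1))


spread_small = spread_of_r(n=10)
spread_large = spread_of_r(n=200)

print(f"spread of r with n = 10:  {spread_small:.3f}")
print(f"spread of r with n = 200: {spread_large:.3f}")
```

The spread for samples of 10 should come out several times larger than the spread for samples of 200, which is what "the larger the sample, the more accurate the estimate" means in practice.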

Whenever some value that is calculated from a sample is used to estimate the (true) value in the population, the question of bias arises. Recall that bias is whether the estimator has a tendency to overshoot or undershoot the target value. If possible, we always use the estimator that has the least bias. In the case of correlation coefficients, the value of r from a sample provides a very good estimate of ρ in the population, as long as the sample size is not very small. If you are working with samples of fewer than 30 participants, you may wish to adjust the value of r when using it as an estimate of ρ. The details of this adjustment are beyond the scope of this unit, but you should be aware that better estimates of ρ are available for these situations.

Before conducting our study, we need to decide the size of our sample. This decision must be informed by the purpose of minimizing inaccurate estimates when we later analyze our sample data. So, we need to plan for a sufficient sample size. It is not easy to give a specific number, because our sample size should be based on prior and expected sizes of the relationship of interest. Also, the sample should be large enough to detect small effects and to make sure that our results are not due to chance. Nowadays, there are a variety of software tools that can help you decide the size of the sample. At the moment, just be aware that, if you have a sample that is too small, the correlation coefficient that you obtain from your sample data may be inadequate as an estimate of the correlation coefficient for the population.

In addition, in order to improve the interpretation of your correlation coefficient, a confidence interval will help. That is why it is always advisable to include a confidence interval for the obtained coefficient (typically, a 95% confidence interval). The confidence interval provides the range of likely values of the coefficient in the population from which the sample was taken. A sample r may suggest quite a strong relationship between two variables; however, if its 95% confidence interval is very wide (as could be the case with a very small sample), then the strength of the relationship in the population could be negligible (and, therefore, of little importance), or it could be strong (and, therefore, of high relevance). So, if the confidence interval is very wide, it is difficult to make a valuable interpretation of the results. In general, a narrower confidence interval will allow for a more accurate estimation of the correlation in the population.

Conclusions

A correlation coefficient shows the strength and direction of an association between two variables. Note that a correlation describes a relationship between two variables, but does not explain why. Thus, it should never be interpreted as evidence of a causal relationship between the variables. If r is relatively strong, you can assume that when one variable increases, the other variable will increase as well (for a positive relation) or the other variable will decrease (for a negative relation). But r does not allow you to predict, precisely, the value of one variable based on the value of the other variable. To do that, we have another statistical tool: regression analysis, which is covered in a separate unit.
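The unit does not say how the confidence interval is computed. One common approach, assumed here, is the Fisher z transformation: convert r to z with the inverse hyperbolic tangent, add and subtract 1.96 standard errors (the standard error being 1 divided by the square root of n - 3), and convert the limits back to the r scale. The r value of 0.70 and the sample sizes of 10 and 100 are hypothetical, chosen only to show how the interval narrows as the sample grows.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation, using the
    Fisher z transformation (an assumed method, not given in the unit)."""
    z = math.atanh(r)                # transform r to the z scale
    se = 1 / math.sqrt(n - 3)        # standard error on the z scale
    low = math.tanh(z - z_crit * se)   # back-transform the lower limit
    high = math.tanh(z + z_crit * se)  # back-transform the upper limit
    return low, high


# Hypothetical example: the same sample r, two different sample sizes.
low_small, high_small = fisher_ci(0.70, n=10)
low_large, high_large = fisher_ci(0.70, n=100)

print(f"n = 10:  95% CI from {low_small:.2f} to {high_small:.2f}")
print(f"n = 100: 95% CI from {low_large:.2f} to {high_large:.2f}")
```

With 10 participants the interval runs from roughly 0.13 to 0.92, so the population relationship could be anything from negligible to very strong; with 100 participants it narrows to roughly 0.58 to 0.79.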