Psychology Textbook Unit 2 Managing Data PDF Free Download

Unit 2. Managing Data

TOBY AND CASTRO

Summary. This unit discusses the distinction between raw data and the values that are used for the subsequent analysis. The various formats and rules with regard to managing and storing data are also reviewed.

Prerequisite Units

Introduction to Statistics for Psychological Science

Psychological Data

Psychology is an empirical science. This means that psychological theories are tested by comparing their predictions to actual observations, which are often referred to as data. If the data match the predictions, the theory survives; if the data fail to confirm the predictions, the theory is falsified (which is a fancy way of saying disproved). Because of their crucial role, psychological data must be handled carefully and treated with respect. They must be organized and stored in a logical and standardized manner and format, allowing for their free exchange between researchers. One useful saying that summarizes empirical science is that while researchers are encouraged to argue about theory, everyone must agree on the data.

In some (very rare) instances, the relevant data are quite simple, such as a single answer to one question that was asked of a sample of people. In such a case, a list of yes or no responses would be sufficient. Usually, however, quite a lot of information is gathered from each participant. For example, besides an answer to a question, certain demographic information (age, sex, etc.) might also be collected. Alternatively, maybe a series of questions is asked. Or the questions might require more complicated responses, which could be numerical ("on a scale of 1 to 10, how attractive do you find this person to be?") or qualitative ("what is your favorite color?"). Or maybe the participant is asked to perform a task many times, instead of just once, with multiple pieces of information (response time and response choice) recorded on every separate trial.

The variety and potential complexity of psychological data raise at least two issues. One issue concerns the storage of data: should all of the data from every participant be stored together in one file, or should the data from each participant be stored separately?

Another issue concerns the distinction between what can be called "raw data," which are the individual observations as they were originally collected, and "pre-processed data" (or "data after pre-processing"), which are the values that will be used for the formal analysis.

Pre-processing Data

Starting with the second issue, imagine an experiment in which participants must press the left button if the stimulus is blue and the right button if the stimulus is orange. The location of the stimulus is irrelevant (only the color matters), but sometimes the stimulus appears on the left and sometimes it appears on the right. This task is based on work by Richard Simon of the University of Iowa. The participants are asked to respond as quickly as possible while making very few errors.

In such an experiment, it is typical for each of the four possible combinations of stimulus color and stimulus location to be used on a very large number of trials (e.g., 50 or more of each). Thus, the raw data from a single participant would be hundreds of sets of stimulus conditions (color and location) with multiple response values (which button was pressed and how long was required to make the response). But the analysis would not concern these individual sets of trial observations; instead, the raw data would be converted to a small number of summary scores (covered in a separate unit), such as average response time and percent correct for each of the four combinations of stimulus color and stimulus location.

Note that averaging is not the only change to the data that might occur during pre-processing. In the above example, the raw data were recorded in terms of stimulus color (blue vs. orange), stimulus location (left vs. right), response time (in milliseconds), and which response was made. During pre-processing, the stimulus locations might be re-coded in terms of whether the stimulus appeared near the correct response for the trial (which is often referred to as "congruent" or "compatible") or on the opposite side ("incongruent" or "incompatible"), because that is what most theories predict to be important. Likewise, the value of which response was made would be converted to whether the response was correct. In other words, pre-processing in this case is doing two things: it is taking a huge number of pieces of raw data and reducing them to a few summary scores, and it is converting the data from the format of the events during the experiment to the format about which the theories make predictions.

Another example of pre-processing occurs when a long questionnaire is used to measure a small number of psychological constructs, such as measures of depression and anxiety. This often requires that participants provide ratings (e.g., on a rating scale) of how well each of many different statements ("I often feel sad" or "I am often quite worried") applies to them. In this case, the raw data are the individual ratings for each of the questions, but what is needed for the analysis are the condensed scores for each of the small number of constructs. During pre-processing, each participant's answers to many questions are combined to create these values, and then these are what are used for the analysis.

Pre-processing Data vs. Analyzing Data

The key to the distinction between pre-processing data and subsequent analysis is best thought about in terms of why each is done. The purpose of pre-processing is to convert the raw data into the format for which the relevant theories make predictions. Pre-processing also simplifies matters by reducing the total number of pieces of data. Note that the theories are not yet being tested; the data are only being prepared for the subsequent test.

For example, very few theories make predictions about response times on individual trials; they make predictions about average response time, instead. If the theories did make predictions about individual trials, then the raw data would not be averaged; they would be left as individual response times. Likewise, few theories make predictions about the answers a participant might give to a single, specific question, such as "how often do you skip breakfast?"; they make predictions about the underlying psychological construct, such as depression. As above, pre-processing the numerous, separate answers into one or two measures of psychological constructs is not only simplifying the data by reducing the number of values to be analyzed, it is also converting the data into the form that matches the theory.

File Formats

In both examples in the previous section, pre-processing produces a small number of values from a very large amount of raw data. This brings us to the other issue raised above: how many files should be used?

To answer this question, we need to discuss the various formats for files. Psychological data are usually stored in large tables, often referred to as spreadsheets, which can be viewed and edited using various software packages, such as Excel. All of the pictures below are of parts of Excel spreadsheets. Although the details of these tables vary considerably, they all obey one simple rule: each column in a spreadsheet always contains a single, specific piece of information, which does not change across rows, and each column usually has a header (a special top row) that indicates what exactly is in every box in the column.

In the case of a computer-based experiment, the raw data are usually produced by the software that runs the experiment, with each participant's data in a separate file. A complicated example of this was provided at the beginning of this unit; a much simpler example that matches the experiment from above is provided in the following figure.

[Figure: Part of a data file containing raw data.]

Note how each column has a label in the first row, while each subsequent row contains the information related to a single trial. You know this because the first column tells you how the rows are defined. It is standard good practice to do this: have the first column in the spreadsheet specify what is contained in each row.

In general, the raw data from a computer-based experiment will use this format. From these data, the summary values would be calculated (repeated for each participant, separately), and then these calculated values would be placed in a new file that uses a different format.
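As a rough illustration of this step, the sketch below turns one participant's raw trial rows into a single row of summary values. It is not the authors' procedure: the field names (`color`, `location`, `response`, `rt_ms`), the function name, and the example trials are all assumptions made for illustration. To stay short, it summarizes by the re-coded congruency of each trial rather than by all four color and location combinations, averages response time over all trials (not only correct ones), and assumes every condition has at least one trial.

```python
from statistics import mean

# Task rule from the text: blue -> left button, orange -> right button.
RESPONSE_FOR_COLOR = {"blue": "left", "orange": "right"}


def preprocess_trials(trials):
    """Reduce one participant's raw trials to a few summary scores.

    Each trial is a dict with hypothetical keys: color, location,
    response (which button was pressed), and rt_ms (response time).
    """
    by_condition = {"congruent": [], "incongruent": []}
    for trial in trials:
        correct_side = RESPONSE_FOR_COLOR[trial["color"]]
        # Re-code location: was the stimulus on the side of the correct response?
        condition = "congruent" if trial["location"] == correct_side else "incongruent"
        # Re-code response: which button was pressed -> was it correct?
        is_correct = trial["response"] == correct_side
        by_condition[condition].append((is_correct, trial["rt_ms"]))

    summary = {}
    for condition, results in by_condition.items():
        n_correct = sum(1 for ok, _ in results if ok)
        summary[f"rt_{condition}"] = mean(rt for _, rt in results)
        summary[f"pc_{condition}"] = 100 * n_correct / len(results)
    return summary


# Made-up raw data for one participant (four trials instead of hundreds).
example_trials = [
    {"color": "blue", "location": "left", "response": "left", "rt_ms": 400},
    {"color": "blue", "location": "right", "response": "left", "rt_ms": 450},
    {"color": "orange", "location": "right", "response": "right", "rt_ms": 420},
    {"color": "orange", "location": "left", "response": "left", "rt_ms": 500},
]

# One row per participant, ready to be written to the summary file.
row = {"id": 1, **preprocess_trials(example_trials)}
```

Running the same function on each participant's file and collecting the resulting rows would give a table with one row per participant, which is the format described next.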

[Figure: Data file with summary values for each participant. Columns include ID and response times.]

As was true of the raw-data files, each column in this sheet contains a specific piece of information, as indicated by the header row at the top. In contrast to the separate files for each participant, in which each subsequent row was a trial, in this file each row holds the data for a particular participant. This is the standard format for the data that will be used for the analysis: each participant gets a row in the spreadsheet.

Wide vs. Long Format

The technical label for a file that places all of the values for each participant on one (and only one) row is wide format, because these spreadsheets can often have a very large number of columns. This is the standard in psychology. An alternative format uses multiple rows for each participant, with some columns being used to indicate the condition(s) under which the subset of the data were collected. This is known as long format. Here is an example using the same data as previously, but now in long format.

[Figure: Data file with the same values as in the previous figure, but in long format. Columns include ID, Condition, and response time.]

Long format is rarely used, and the software packages that still employ this format for certain analyses usually have a procedure for converting one format to the other. For this reason, and to maintain consistency, psychological data are almost always stored in wide format.

In contrast to computer-based experiments, in which each participant performs hundreds of trials and separate files are used for each participant's raw data, all of the data from a questionnaire study are usually stored in one file. These files use the wide format: each item on the questionnaire gets a separate column, each participant gets one row, and the first row in the file provides the label for each of the columns. When condensed scores (measures of the constructs) are calculated based on the raw data in several columns, these can be added to the same file as a new, separate column, or placed in a new file that holds only the values that are needed for the subsequent analysis.
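Two of the operations described in this section can be sketched in a few lines: adding a condensed score to a wide-format questionnaire file as a new column, and converting a wide-format table to long format. The function names, column names, and values below are hypothetical. Averaging the items is only one possible way of combining them and is an assumption here; a real questionnaire may call for sums or other scoring rules.

```python
def add_scale_score(rows, item_columns, score_name):
    """Return copies of wide-format rows with one extra column: the item mean."""
    scored = []
    for row in rows:
        new_row = dict(row)  # copy, so the original rows are left untouched
        new_row[score_name] = sum(row[c] for c in item_columns) / len(item_columns)
        scored.append(new_row)
    return scored


def wide_to_long(rows, id_column, value_columns,
                 condition_name="condition", value_name="value"):
    """Turn one row per participant into one row per participant and condition."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_rows.append({
                id_column: row[id_column],
                condition_name: column,
                value_name: row[column],
            })
    return long_rows


# Made-up questionnaire data in wide format: one row per participant.
wide_questionnaire = [
    {"id": 1, "sad": 4, "worried": 2},
    {"id": 2, "sad": 1, "worried": 3},
]
scored = add_scale_score(wide_questionnaire, ["sad", "worried"], "distress")

# Made-up summary values in wide format, converted to long format.
wide_summary = [{"id": 1, "congruent": 410, "incongruent": 475}]
long_rows = wide_to_long(wide_summary, "id", ["congruent", "incongruent"])
```

Because `add_scale_score` works on copies, the original item columns stay exactly as they were collected, whichever file the new column ends up in.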

Data Storage

Before anything else, here is the cardinal rule: never throw any information away. Keep all of the raw data, even if you currently have no need or use for them. There are two main reasons for this: first, you might discover a use for these data later; second, someone else might already have a use for them. It might also appear a bit suspicious if you collect certain data and then delete them, making them unavailable to others who might ask about them.

This rule applies to all of the data as originally collected. If, for example, you omit a few trials of a computer-based experiment during pre-processing, maybe because the response was abnormally fast or slow (i.e., an outlier), you do not delete the trial from the file; you merely skip over it during processing. Likewise, if you decide to omit all of the data from a certain participant, maybe because they did not appear to follow instructions, you still keep the file that holds their raw data; you just do not include their values in the file for subsequent analysis.

In situations where you have large amounts of raw data, such as computer-based experiments or long questionnaires, the files that contain the raw data can be stored separately from the single, small file that holds the values that are ready for analysis. As discussed above, the raw-data files might use a format that is different from the final file; that is fine, as long as the header row in every file makes it clear what is contained in each column. If condensed scores were added to a file, as is often done with questionnaire data, you can save two versions, if you wish: one with everything and another with only the final values that are needed for the analysis. That, also, is fine, as long as you keep all of the raw data somewhere.
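A minimal sketch of "skip, do not delete" is given below. The response-time cutoffs (200 and 2000 ms), the field names, and the function names are hypothetical choices for illustration, not values from this unit. The point is only that both functions return new lists and leave the stored raw data exactly as collected.

```python
def usable_trials(trials, min_rt_ms=200, max_rt_ms=2000):
    """Return the trials to use during processing; `trials` itself is not changed."""
    return [t for t in trials if min_rt_ms <= t["rt_ms"] <= max_rt_ms]


def analysis_rows(summary_rows, excluded_ids):
    """Return the rows for the analysis file, leaving out excluded participants."""
    return [r for r in summary_rows if r["id"] not in excluded_ids]


# Made-up raw trials: trial 1 is abnormally fast, trial 3 abnormally slow.
raw_trials = [
    {"trial": 1, "rt_ms": 95},
    {"trial": 2, "rt_ms": 430},
    {"trial": 3, "rt_ms": 5200},
    {"trial": 4, "rt_ms": 510},
]
kept = usable_trials(raw_trials)

# Made-up summary rows: participant 2 did not appear to follow instructions.
all_rows = [{"id": 1, "rt": 470}, {"id": 2, "rt": 1210}, {"id": 3, "rt": 455}]
final_rows = analysis_rows(all_rows, excluded_ids={2})
```

After this runs, `raw_trials` still holds all four trials and `all_rows` still holds all three participants; only `kept` and `final_rows`, which feed the analysis, are shorter.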
