Chapter 10: Planning Data Analysis and Completing the Design Matrix

Slide 2: Introduction
  - Data Analysis Strategy
  - Analyzing Qualitative Data
  - Analyzing Quantitative Data
  - Linking Quantitative Data and Qualitative Data

Slide 3: Data Collection and Analysis

Slide 4: Qualitative Analysis
  - Best used for in-depth understanding of the intervention
  - Answers questions like:
      - What are some of the difficulties faced by staff?
      - Why do participants say they dropped out early?
      - What is the experience like for participants?

Slide 5: Quantitative Analysis
  - Can be used to answer questions like:
      - What are the mean scores for the different groups of participants?
      - How do participants rate the relevance of the intervention on a scale of one to five?
      - How much variability is there in the responses to the item?
      - Are the differences between the two groups statistically significant?

Slide 6: Qualitative Data
  - Description of program, process, and experiences
  - To understand the context of the situation
  - To understand perceptions
  - Research evolves as questions emerge
  - Flexible design

Slide 7: Qualitative Data Analysis
  - Used for any non-numerical data collected as part of the evaluation:
      - unstructured observations
      - open-ended interviews
      - analysis of written documents
      - focus group transcripts
      - diaries, observations
  - Analysis is challenging
  - Take care for accuracy (validity concern)

Slide 8: Making Good Notes
  - Capture as much information as possible
  - Pay close attention to language
  - Write down observations
  - Capture your immediate thoughts
  - Leave time to write up notes immediately

Slide 9: Triangulation
  - Use three or more sources of information to verify and substantiate your data
  - Examples:
      - interviews, focus groups, questionnaires
      - questionnaires, available data, expert panels
      - observations, program records, interviews
      - interviews, diaries, available data

Slide 10: Early Steps in Qualitative Analysis (1 of 3)
  - While collecting data:
      - keep good records
      - write up interviews, impressions, and notes from focus groups
      - make constant comparisons as you progress
      - meet with the team regularly to compare notes and make adjustments

Slide 11: Early Steps in Qualitative Analysis (2 of 3)
  - Write a contact summary report:
      - one-page summary after each major interview or focus group
      - main issues
      - major information obtained
      - what was the most interesting, illuminating, or important?
      - what new questions need to be explored?

Slide 12: Early Steps in Qualitative Analysis (3 of 3)
  - Use tools to help you:
      - create a subjectivity file with your own reactions during the study, including your feelings, hunches, and reactions
      - file the ideas that emerge as you proceed
      - keep a file of quotations from the data collection

Slide 13: Maintain an Iterative Dialogue
  - Share information early and often with key informants
  - Have others review early drafts with the intention of eliciting information, questions, and other ways of interpreting data

Slide 14: Reading and Coding Data
  - Read all of the data carefully
  - Come up with names or labels for topics
  - Describe the topics
  - Using codes, classify all of the data

Slide 15: Coding
  - Manual coding:
      - use colored pens, pencils, or papers
      - cut with scissors to manually sort
      - use a number coding system
  - Computer program coding:
      - word processors
      - presentation software
      - databases, spreadsheets, or qualitative data analysis software

Slide 16: Drawing Out Themes and Patterns
  - As you review, begin to make notes
  - Goal is to summarize what you have seen or heard:
      - common words
      - phrases
      - themes
      - patterns
  - Also identify where they are so you can find them again if you need to verify
  - May want to use a spreadsheet

Slide 17: Content Analysis
  - Identify certain words or concepts in text or speech
  - Conceptual analysis:
      - look at word frequencies
  - Relational analysis:
      - look at word frequencies
      - explore relationships among concepts

Slide 18: Computer Help for Qualitative Data Analysis
  - Software packages to help you organize data
  - Search, organize, categorize, and annotate textual and visual data
  - Help you visualize the relationships among data

Slide 19: Examples of QDA Software
  - NUD*IST
  - Ethnograph
  - Qualpro
  - Hyperqual
  - Anthropax
  - Atlas-ti

Slide 20: Controlling for Bias
  - We tend to see what we want to see and may miss things that do not conform to our expectations
  - Use well-trained recorders
  - Evaluators review documents and code them in themes
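
The word-frequency step of conceptual analysis (Slide 17) can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for a QDA package: the transcripts, the stop-word list, and the function name `word_frequencies` are all hypothetical and chosen only for the example.

```python
import re
from collections import Counter

# Hypothetical interview excerpts; not taken from any real evaluation.
TRANSCRIPTS = [
    "The training was useful, but the schedule was hard for staff.",
    "Staff said the schedule changed often and the training felt rushed.",
    "I liked the training. The schedule was the main problem.",
]

# Small illustrative stop-word list (an assumption; adjust for your data).
STOP_WORDS = {"the", "was", "but", "for", "and", "i", "said", "a", "of"}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


freq = word_frequencies(TRANSCRIPTS)
for word, count in freq.most_common(3):
    print(word, count)
```

Frequent words are only candidates for themes. The evaluator still has to read the passages where they occur before assigning codes.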

Slide 21: Concluding Thoughts on Qualitative Data
  - Qualitative data collection is not the easy option:
      - labor intensive and time consuming
      - reliability among coders, using a coding scheme, is essential
  - Can reveal some of the most valuable information
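
One simple way to check reliability among coders is to compare the codes two people assign to the same segments. The sketch below computes plain percent agreement. The slides do not name a specific reliability measure, so this choice, the function name `percent_agreement`, and the example codes are assumptions made for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Share of segments to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same segments")
    if not codes_a:
        raise ValueError("No coded segments to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes assigned by two coders to the same eight segments.
coder_1 = ["access", "cost", "cost", "staff", "access", "staff", "cost", "access"]
coder_2 = ["access", "cost", "staff", "staff", "access", "staff", "cost", "cost"]

agreement = percent_agreement(coder_1, coder_2)
print(f"Coders agree on {agreement:.0%} of segments")
```

Percent agreement does not correct for agreement that would occur by chance, so treat it as a first check rather than a full reliability statistic.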

Slide 22: Quantitative Data: Statistics
  - Quantitative data are analyzed with statistics:
      - descriptive statistics: used with census or non-random sample data
      - inferential statistics: used with random sample data

Slide 23: Descriptive Statistics
  - Describes the frequency and/or percentage distribution of a single variable
  - Tells how many and what percent
  - Example: 33% of the respondents are male and 67% are female (table on next slide)

Slide 24: Example of Descriptive Statistics in a Table
  - Write-up: Of the 300 people in this program, 67% are women and 33% are men.

Slide 25: Distributions
  - Measures of central tendency:
      - how similar are the data?
      - example: How similar are the ages of this group of people?
  - Measures of dispersion:
      - how dissimilar are the data?
      - example: How much variation is there in the ages?

Slide 26: Measures of Central Tendency
  - The 3 Ms:
      - mode: most frequent response
      - median: midpoint of the distribution
      - mean: arithmetic average
  - Which to use depends on the type of data you have:
      - nominal, ordinal, interval/ratio

Slide 27: Nominal Data
  - Data of names or categories
  - Examples:
      - gender (male, female)
      - religion (Buddhist, Christian, Jewish, Muslim)
      - country of origin (Burma, China, Ethiopia, Peru)
  - Use mode as a measure of central tendency

Slide 28: Ordinal Data
  - Data that have an order, but the "distance" between consecutive responses is not necessarily the same
  - Lacks a zero point
  - Examples: opinion scales that go from "most important" to "least important" or from "strongly agree" to "strongly disagree"
  - Use mode or median as a measure of central tendency

Slide 29: Interval/Ratio Data
  - Data of real numbers: numbers with a zero point that can be divided and compared as ratios
  - Examples: age, income, weight, height
  - Use mode, median, or mean as a measure of central tendency; the choice depends on the distribution:
      - for normal data, mean is best
      - for data with a few very high or very low scores, median is best

Slide 30: Calculating
  - Mode: the value that occurs most often
  - Median: place the values in order, then count down to the halfway point
  - Mean: the arithmetic average (what most people think of as "the average")

Slide 31: Example Data
  - Table 11.7: Sample Data

Slide 32: Example Calculations for % Urban Data
  - Mode: no mode; each value occurs only once
  - Median: there are 6 entries; with the data in order, the two middle scores are 60 and 61, so the median is (60 + 61) / 2 = 60.5
  - Mean: (65 + 60 + 41 + 61 + 58 + 75) / 6 = 60

Slide 33: Measures of Dispersion
  - Range:
      - difference between the highest and lowest value
      - simple to calculate, but not very valuable
  - Standard deviation:
      - measure of the spread of the scores around the mean
      - superior measure: it allows every case to have an impact on its value

Slide 34: Example Calculation for Range
  - Range: high score - low score = range
  - range = 75 - 41
  - range = 34

Slide 35: Normal Curve (Bell)
  [Figure: bell curve; axis label: Frequency]

Slide 36: Standard Deviation

Slide 37: Calculating Standard Deviation
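
The worked % urban calculations (mode, median, mean, range) can be checked with Python's standard `statistics` module, which also gives the standard deviation. This is a sketch under one stated assumption: the six values are treated as the whole group, so the population formula (`pstdev`) is used. Use `statistics.stdev` instead if the values are a sample from a larger population.

```python
import statistics
from collections import Counter

# % urban values from the worked example on the slides.
pct_urban = [65, 60, 41, 61, 58, 75]

counts = Counter(pct_urban)
top = max(counts.values())
# Report a mode only if some value occurs more than once.
modes = [v for v, c in counts.items() if c == top] if top > 1 else []

median = statistics.median(pct_urban)          # average of the two middle values
mean = statistics.mean(pct_urban)              # sum divided by number of entries
value_range = max(pct_urban) - min(pct_urban)  # highest minus lowest
# Assumption: population standard deviation (divide by N, not N - 1).
std_dev = statistics.pstdev(pct_urban)

print("mode:", modes if modes else "none")
print("median:", median)
print("mean:", mean)
print("range:", value_range)
print(f"standard deviation: {std_dev:.2f}")
```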
  - Calculating is time consuming
  - Can use statistical programs:
      - SPSS
      - Excel or other spreadsheet program

Slide 38: Guidelines for Analyzing Quantitative Survey Results

Slide 39: Common Descriptive Statistics
  - Frequencies, Percent, Mean, Median, Mode
  - Money, Percent, Ratio, Comparisons

Slide 40: Describing Two Variables at the Same Time
  - Two variables at once
  - Example: What percent were boys and what percent were girls in hands-on and traditional classes?

Slide 41: Example: Two Variables at the Same Time
  - Source: Fabricated Data, 2004 Survey

Slide 42: Two Variables with Crosstabs
  - Cross tabulation (crosstab):
      - presented as a contingency table in a matrix format
      - displays two or more variables simultaneously
      - each cell shows the number of respondents
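
A crosstab of two variables can be built by counting (row, column) pairs. The sketch below uses the boys/girls by class-type question as its shape, but the records are invented for illustration and do not reproduce the slides' table. The function names `crosstab` and `column_percents` are likewise hypothetical.

```python
from collections import Counter

# Hypothetical (gender, class type) records; invented for illustration only.
records = [
    ("boy", "hands-on"), ("girl", "hands-on"), ("girl", "hands-on"),
    ("girl", "hands-on"), ("boy", "traditional"), ("boy", "traditional"),
    ("boy", "traditional"), ("girl", "traditional"),
]


def crosstab(pairs):
    """Return {(row, column): count} for a list of (row, column) pairs."""
    return Counter(pairs)


def column_percents(table):
    """Convert cell counts to percentages within each column."""
    col_totals = Counter()
    for (_, col), n in table.items():
        col_totals[col] += n
    return {cell: 100 * n / col_totals[cell[1]] for cell, n in table.items()}


table = crosstab(records)
percents = column_percents(table)

rows = sorted({r for r, _ in table})
cols = sorted({c for _, c in table})
print("gender".ljust(8) + "".join(c.rjust(14) for c in cols))
for r in rows:
    cells = "".join(
        f"{table[(r, c)]} ({percents.get((r, c), 0):.0f}%)".rjust(14) for c in cols
    )
    print(r.ljust(8) + cells)
```

Each cell shows the number of respondents, with the percentage within its column in parentheses, which answers "what percent were boys and what percent were girls" for each class type.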

Slide 43: Example Crosstabs
  - N = 125
  - Source: Fabricated Data

Slide 44: Variables
  - Independent:
      - variable that you believe explains a change in the dependent variable
      - program evaluation: the program
  - Dependent:
      - variable you want to explain
      - program evaluation: the outcomes

Slide 45: Example: Comparison of Means
  - dependent variable: annual income
  - independent variable: gender

Slide 46: Measure of Relationship
  - How strongly variables are related; reported differently:
      - measures of association range from 0 to 1
      - measures of correlation range from -1 to +1

Slide 47: Interpretation of Correlation
  - Measures of correlation:
      - perfect relationship: 1 or -1
      - closer to 1 or -1: strong relationship
      - .5: moderate/strong (maybe as good as it gets)
      - closer to zero: no relationship
      - .2: slight/weak relationship

Slide 48: Direct Relationship
  - Plus sign (+)
  - Both variables change in the same direction
  - Example: as driving speed increases, death rate goes up

Slide 49: Inverse Relationship
  - Minus sign (-)
  - Both variables change, but in opposite directions
  - Example: as age increases, health status decreases

Slide 50: Inferential Statistics
  - Used to analyze data from randomly selected samples
  - Risk of error because your sample may be different from the population as a whole
  - To make an inference, you first need to estimate the probability of that error

Slide 51: Statistical Significance Tests
  - Tools to estimate how likely the results are in error
  - Called tests of statistical significance:
      - estimate how likely it is that you have gotten the results you see in your analysis by chance alone

Slide 52: Statistical Significance
  - Benchmark of 5%:
      - .05 alpha level or p-value
  - It means we are 95% certain that our sample results are not due to chance
  - Or: the results are statistically significant at the .05 level
  - Most reports do not go beyond .05

Slide 53: Chi Square and t-Test
  - Chi square:
      - one of the most popular statistics
      - easy to calculate and interpret
      - used to compare two sets of nominal data (e.g., marital status and religious affiliation)
      - used to compare two ordinal variables or a combination of nominal and ordinal variables
  - t-test:
      - used to determine if one group of numerical scores is statistically higher or lower than another group of scores
      - compares two means
      - post-test only, two-group randomized design

Slide 54: Hypothesis Testing
  - Research hypothesis is your best guess as to the relationship between variables
      - Example: there is a difference between the per capita incomes of men and women in South Africa
  - Null hypothesis is always a statement that there is "no difference" or "no impact" between our variables
      - Example: there is no difference between the per capita incomes of men and women in South Africa
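
To make the null hypothesis concrete, the sketch below estimates how often a difference in mean income at least as large as the observed one would appear if gender labels were assigned at random. Note that this is a permutation test, swapped in because it needs only the standard library; it is not the t-test named on the slides. The incomes, the function name `permutation_p_value`, and the parameters `n_shuffles` and `seed` are all assumptions made for illustration.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one arises when group labels are assigned at random."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return observed, extreme / n_shuffles


# Hypothetical annual incomes in rand; invented for illustration only.
men = [52000, 61000, 58000, 49000, 66000, 57000, 63000, 55000]
women = [50000, 54000, 47000, 59000, 51000, 53000, 48000, 56000]

observed, p_value = permutation_p_value(men, women)
print(f"Observed difference in means: R{observed:,.0f}")
print(f"Estimated probability under the null hypothesis: {p_value:.3f}")
print("Reject the null hypothesis" if p_value < 0.05
      else "Do not reject the null hypothesis")
```

The decision rule in the last line follows the .05 benchmark from Slide 52.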

Slide 55: Testing for Significant Difference
  - What is the probability of getting a R4,600 difference if we assume there is no difference in the population from which this sample was drawn?
  - If the probability of getting a R4,650 difference is "small", then we reject the null hypothesis
      - "small" is defined as less than .05
  - If the probability is "high", then we do not reject the null hypothesis

Slide 56: Remember
  - A significance test is nothing more than an estimate of the probability of getting the results by chance if there really is no difference in the population

Slide 57: Linking?
  - Should qualitative and quantitative data and associated methods be linked during study design?
  - How?
  - Why?

Slide 58: Qualitative-Quantitative Linkages
  - Confirmation or corroboration: triangulation
  - Richer detail
  - Initiate new lines of thinking
  - Expand the scope