Stata Press | A Gentle Introduction to Stata, Revised Third Edition

Alan C. Acock’s A Gentle Introduction to Stata, Revised Third Edition is aimed at new Stata users who want to become proficient in Stata. After reading this introductory text, new users not only will be able to use Stata well but also will learn new aspects of Stata easily.

Acock assumes that the user is not familiar with any statistical software. This assumption of a blank slate is central to the structure and contents of the book. Acock starts with the basics; for example, the portion of the book that deals with data management begins with a careful and detailed example of turning survey data on paper into a Stata-ready dataset on the computer. When explaining how to go about basic exploratory statistical procedures, Acock includes notes that will help the reader develop good work habits. This mixture of explaining good Stata habits and good statistical habits continues throughout the book.

Acock is quite careful to teach the reader all aspects of using Stata. He covers data management, good work habits (including the use of basic do-files), basic exploratory statistics (including graphical displays), and analyses using the standard array of basic statistical tools (correlation, linear and logistic regression, and parametric and nonparametric tests of location and dispersion). Acock teaches Stata commands by using the menus and dialog boxes while still stressing the value of do-files. In this way, he ensures that all types of users can build good work habits. Each chapter has exercises the motivated reader can use to reinforce the material.

The tone of the book is friendly and conversational without ever being glib or condescending. Important asides and notes about terminology are set off in boxes, which makes the text easy to read without any convoluted twists or forward-referencing. Rather than splitting topics by their Stata implementation, Acock arranges the topics as they would appear in a basic statistics textbook; graphics and postestimation are woven into the material in a natural fashion. Real datasets, such as the General Social Surveys from 2002 and 2006, are used throughout the book.

The focus of the book is especially helpful for those in psychology and the social sciences because the presentation of basic statistical modeling is supplemented with discussions of effect sizes and standardized coefficients. Criteria for model selection, such as semipartial correlations, are also discussed.

The revised third edition of the book has been updated to reflect the new features available in Stata 12 and Stata 11. The ANOVA chapter has been revised to incorporate the pwmean command, which performs pairwise comparisons of means, and the marginsplot command, which simplifies the construction of graphs showing interaction effects. Menus and screenshots have also been updated. As in the third edition, an entire chapter is devoted to the analysis of missing data and the use of multiple-imputation methods. Factor-variable notation is introduced as an alternative to the manual creation of interaction terms. The new Variables Manager and revamped Data Editor are featured in the discussion of data management.
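As a rough illustration of the Stata 12 features mentioned above, the following do-file sketch shows pairwise mean comparisons, factor-variable notation, and a plot of an interaction. It is not taken from the book: the auto dataset that ships with Stata, the choice of variables, and the Tukey adjustment are assumptions made only for this example.

```stata
* Illustrative sketch only; dataset and variables are assumptions, not the book's examples
sysuse auto, clear

* Pairwise comparisons of mean mpg across repair-record groups,
* with Tukey's adjustment for multiple comparisons
pwmean mpg, over(rep78) mcompare(tukey) effects

* Factor-variable notation: ## adds both main effects and their interaction,
* so no interaction term has to be generated by hand
regress mpg i.foreign##c.weight

* Predicted mpg for domestic and foreign cars over a range of weights
margins foreign, at(weight=(2000(500)4500))

* Graph the margins to display the interaction
marginsplot
```

Saving these lines in a do-file and running them from the Do-file Editor reproduces the whole sequence, which matches the book's emphasis on do-files as a work habit.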

1.1 Conventions
1.2 Introduction
1.3 The Stata screen
1.4 Using an existing dataset
1.5 An example of a short Stata session
1.6 Summary
1.7 Exercises

2.1 Creating a dataset
2.2 An example questionnaire
2.3 Develop a coding system
2.4 Entering data using the Data Editor

2.4.1 Value labels

2.5 The Variables Manager
2.6 The Data Editor (Browse) view
2.7 Saving your dataset
2.8 Checking the data
2.9 Summary
2.10 Exercises

3.1 Introduction
3.2 Planning your work
3.3 Creating value labels
3.4 Reverse-code variables
3.5 Creating and modifying variables
3.6 Creating scales
3.7 Save some of your data
3.8 Summary
3.9 Exercises

4.1 Introduction
4.2 How Stata commands are constructed
4.3 Creating a do-file
4.4 Copying your results to a word processor
4.5 Logging your command file
4.6 Summary
4.7 Exercises

5.1 Descriptive statistics and graphs
5.2 Where is the center of a distribution?
5.3 How dispersed is the distribution?
5.4 Statistics and graphs—unordered categories
5.5 Statistics and graphs—ordered categories and variables
5.6 Statistics and graphs—quantitative variables
5.7 Summary
5.8 Exercises

6.1 Relationship between categorical variables
6.2 Cross-tabulation
6.3 Chi-squared test

6.3.1 Degrees of freedom
6.3.2 Probability tables

6.4 Percentages and measures of association
6.5 Odds ratios when dependent variable has two categories
6.6 Ordered categorical variables
6.7 Interactive tables
6.8 Tables—linking categorical and quantitative variables
6.9 Power analysis when using a chi-squared test of significance
6.10 Summary
6.11 Exercises

7.1 Introduction to tests for one or two means
7.2 Randomization
7.3 Random sampling
7.4 Hypotheses
7.5 One-sample test of a proportion
7.6 Two-sample test of a proportion
7.7 One-sample test of means
7.8 Two-sample test of group means

7.8.1 Testing for unequal variances

7.9 Repeated-measures t test
7.10 Power analysis
7.11 Nonparametric alternatives

7.11.1 Mann–Whitney two-sample rank-sum test
7.11.2 Nonparametric alternative: Median test

7.12 Summary
7.13 Exercises

8.1 Introduction to bivariate correlation and regression
8.2 Scattergrams
8.3 Plotting the regression line
8.4 Correlation
8.5 Regression
8.6 Spearman’s rho: Rank-order correlation for ordinal data
8.7 Summary
8.8 Exercises

9.1 The logic of one-way analysis of variance
9.2 ANOVA example
9.3 ANOVA example using survey data
9.4 A nonparametric alternative to ANOVA
9.5 Analysis of covariance
9.6 Two-way ANOVA
9.7 Repeated-measures design
9.8 Intraclass correlation—measuring agreement
9.9 Summary
9.10 Exercises

10.1 Introduction to multiple regression
10.2 What is multiple regression?
10.3 The basic multiple regression command
10.4 Increment in R-squared: Semipartial correlations
10.5 Is the dependent variable normally distributed?
10.6 Are the residuals normally distributed?
10.7 Regression diagnostic statistics

10.7.1 Outliers and influential cases
10.7.2 Influential observations: DFbeta
10.7.3 Combinations of variables may cause problems

10.8 Weighted data
10.9 Categorical predictors and hierarchical regression
10.10 A shortcut for working with a categorical variable
10.11 Fundamentals of interaction
10.12 Power analysis in multiple regression
10.13 Summary
10.14 Exercises

11.1 Introduction to logistic regression
11.2 An example
11.3 What is an odds ratio and a logit?

11.3.1 The odds ratio
11.3.2 The logit transformation

11.4 Data used in rest of chapter
11.5 Logistic regression
11.6 Hypothesis testing

11.6.1 Testing individual coefficients
11.6.2 Testing sets of coefficients

11.7 Nested logistic regressions
11.8 Power analysis when doing logistic regression
11.9 Summary
11.10 Exercises

12.1 Overview of reliability and validity
12.2 Constructing a scale

12.2.1 Generating a mean score for each person

12.3 Reliability

12.3.1 Stability and test–retest reliability
12.3.2 Equivalence
12.3.3 Split-half and alpha reliability—internal consistency
12.3.4 Kuder–Richardson reliability for dichotomous items
12.3.5 Rater agreement—kappa (K)

12.4 Validity

12.4.1 Expert judgment
12.4.2 Criterion-related validity
12.4.3 Construct validity

12.5 Factor analysis
12.6 PCF analysis

12.6.1 Orthogonal rotation: Varimax
12.6.2 Oblique rotation: Promax

12.7 But we wanted one scale, not four scales

12.7.1 Scoring our variable

12.8 Summary
12.9 Exercises

13.1 The nature of the problem
13.2 Multiple imputation and its assumptions about the mechanism for missingness
13.3 What variables do we include when doing imputations?
13.4 Multiple imputation
13.5 A detailed example

13.5.1 Preliminary analysis
13.5.2 Setup and multiple-imputation stage
13.5.3 The analysis stage
13.5.4 For those who want an R² and standardized βs
13.5.5 When impossible values are imputed

13.6 Summary
13.7 Exercises

A.1 Introduction to the appendix
A.2 Resources

A.2.1 Web resources
A.2.2 Books about Stata
A.2.3 Short courses
A.2.4 Acquiring data

A.3 Summary

A Gentle Introduction to Stata, Revised Third Edition
Alan C. Acock
Copyright 2012
ISBN-13: 978-1-59718-109-9
Pages: 401; paperback
Price: $48.00
