## A Gentle Introduction to Stata, Fourth Edition

Alan C. Acock

Copyright 2014; ISBN-13: 978-1-59718-142-6; Pages: 468; paperback; Price $54.00

Alan C. Acock’s *A Gentle Introduction to Stata, Fourth Edition* is aimed at
new Stata users who want to become proficient in Stata. After reading this
introductory text, new users will be able not only to use Stata well but
also to learn new aspects of Stata. Acock assumes that the user is
not familiar with any statistical software. This assumption of a blank
slate is central to the structure and contents of the book. Acock starts
with the basics; for example, the portion of the book that deals with data
management begins with a careful and detailed example of turning survey data
on paper into a Stata-ready dataset on the computer. When explaining how to
go about basic exploratory statistical procedures, Acock includes notes that
will help the reader develop good work habits. This mixture of explaining
good Stata habits and good statistical habits continues throughout the book.

Acock is quite careful to teach the reader all aspects of using Stata. He covers data management, good work habits (including the use of basic do-files), basic exploratory statistics (including graphical displays), and analyses using the standard array of basic statistical tools (correlation, linear and logistic regression, and parametric and nonparametric tests of location and dispersion). He also successfully introduces some more advanced topics such as multiple imputation and structural equation modeling in a very approachable manner. Acock teaches Stata commands by using the menus and dialog boxes while still stressing the value of do-files. In this way, he ensures that all types of users can build good work habits. Each chapter has exercises that the motivated reader can use to reinforce the material.

The tone of the book is friendly and conversational without ever being glib
or condescending. Important asides and notes about terminology are set off
in boxes, which makes the text easy to read without any convoluted twists or
forward referencing. Rather than splitting topics by their Stata
implementation, Acock arranges the topics as they would appear in a basic
statistics textbook; graphics and postestimation are woven into the material
in a natural fashion. Real datasets, such as the *General Social Surveys* from
2002 and 2006, are used throughout the book.

The focus of the book is especially helpful for those in psychology and the social sciences because the presentation of basic statistical modeling is supplemented with discussions of effect sizes and standardized coefficients. Various criteria for model selection, such as semipartial correlations, are discussed. Acock also covers a variety of commands available for evaluating reliability and validity of measurements.

The fourth edition has been updated to include new features in
Stata 13. Effect-size computation is performed using the **esize** and **estat
esize** commands. Power and sample-size analysis for two-sample tests of
means, as well as one-way, two-way, and repeated measures ANOVA models, is
demonstrated using the **power** suite of commands. The multiple regression
chapter includes a new section on modeling quadratic relationships. The
chapter on logistic regression contains new material on examining effects of
predictors using **margins** and **marginsplot**.
A newly added chapter is devoted to Stata’s **sem** and **gsem**
commands for structural equation modeling. This
chapter focuses on fitting linear and logistic regression models, thinking
of these models in terms of path diagrams, and expanding the capabilities of
**regress** and **logistic** using **sem** and **gsem**.
After covering models with one response variable, Acock extends these concepts to performing path analysis.

List of figures

List of tables

List of boxed tips

Preface (pdf)

Support materials for the book

1 Getting started

1.1 Conventions

1.2 Introduction

1.3 The Stata screen

1.4 Using an existing dataset

1.5 An example of a short Stata session

1.6 Summary

1.7 Exercises

2 Entering data

2.1 Creating a dataset

2.2 An example questionnaire

2.3 Develop a coding system

2.4 Entering data using the Data Editor

2.4.1 Value labels

2.5 The Variables Manager

2.6 The Data Editor (Browse) view

2.7 Saving your dataset

2.8 Checking the data

2.9 Summary

2.10 Exercises

3 Preparing data for analysis

3.1 Introduction

3.2 Planning your work

3.3 Creating value labels

3.4 Reverse-code variables

3.5 Creating and modifying variables

3.6 Creating scales

3.7 Save some of your data

3.8 Summary

3.9 Exercises

4 Working with commands, do-files, and results

4.1 Introduction

4.2 How Stata commands are constructed

4.3 Creating a do-file

4.4 Copying your results to a word processor

4.5 Logging your command file

4.6 Summary

4.7 Exercises

5 Descriptive statistics and graphs for one variable

5.1 Descriptive statistics and graphs

5.2 Where is the center of a distribution?

5.3 How dispersed is the distribution?

5.4 Statistics and graphs—unordered categories

5.5 Statistics and graphs—ordered categories and variables

5.6 Statistics and graphs—quantitative variables

5.7 Summary

5.8 Exercises

6 Statistics and graphs for two categorical variables

6.1 Relationship between categorical variables

6.2 Cross-tabulation

6.3 Chi-squared test

6.3.1 Degrees of freedom

6.3.2 Probability tables

6.4 Percentages and measures of association

6.5 Odds ratios when dependent variable has two categories

6.6 Ordered categorical variables

6.7 Interactive tables

6.8 Tables—linking categorical and quantitative variables

6.9 Power analysis when using a chi-squared test of significance

6.10 Summary

6.11 Exercises

7 Tests for one or two means

7.1 Introduction to tests for one or two means

7.2 Randomization

7.3 Random sampling

7.4 Hypotheses

7.5 One-sample test of a proportion

7.6 Two-sample test of a proportion

7.7 One-sample test of means

7.8 Two-sample test of group means

7.8.1 Testing for unequal variances

7.9 Repeated-measures t test

7.10 Power analysis

7.11 Nonparametric alternatives

7.11.1 Mann–Whitney two-sample rank-sum test

7.11.2 Nonparametric alternative: Median test

7.12 Summary

7.13 Exercises

8 Bivariate correlation and regression

8.1 Introduction to bivariate correlation and regression

8.2 Scattergrams

8.3 Plotting the regression line

8.4 An alternative to producing a scattergram, binscatter

8.5 Correlation

8.6 Regression

8.7 Spearman’s rho: Rank-order correlation for ordinal data

8.8 Summary

8.9 Exercises

9 Analysis of variance

9.1 The logic of one-way analysis of variance

9.2 ANOVA example

9.3 ANOVA example using survey data

9.4 A nonparametric alternative to ANOVA

9.5 Analysis of covariance

9.6 Two-way ANOVA

9.7 Repeated-measures design

9.8 Intraclass correlation—measuring agreement

9.9 Power analysis with ANOVA

9.9.1 One-way ANOVA

Power analysis for two-way ANOVA

9.9.2 Power analysis for repeated-measures ANOVA

9.9.3 Summary of power analysis for ANOVA

9.10 Summary

9.11 Exercises

10 Multiple regression

10.1 Introduction to multiple regression

10.2 What is multiple regression?

10.3 The basic multiple regression command

10.4 Increment in R-squared: Semipartial correlations

10.5 Is the dependent variable normally distributed?

10.6 Are the residuals normally distributed?

10.7 Regression diagnostic statistics

10.7.1 Outliers and influential cases

10.7.2 Influential observations: DFbeta

10.7.3 Combinations of variables may cause problems

10.8 Weighted data

10.9 Categorical predictors and hierarchical regression

10.10 A shortcut for working with a categorical variable

10.11 Fundamentals of interaction

10.12 Nonlinear relations

10.12.1 Fitting a quadratic model

10.12.2 Centering when using a quadratic term

10.12.3 Do we need to add a quadratic component?

10.13 Power analysis in multiple regression

10.14 Summary

10.15 Exercises

11 Logistic regression

11.1 Introduction to logistic regression

11.2 An example

11.3 What is an odds ratio and a logit?

11.3.1 The odds ratio

11.3.2 The logit transformation

11.4 Data used in rest of chapter

11.5 Logistic regression

11.6 Hypothesis testing

11.6.1 Testing individual coefficients

11.6.2 Testing sets of coefficients

11.7 More on interpreting results from logistic regression

11.8 Nested logistic regressions

11.9 Power analysis when doing logistic regression

11.10 Summary

11.11 Exercises

12 Measurement, reliability, and validity

12.1 Overview of reliability and validity

12.2 Constructing a scale

12.2.1 Generating a mean score for each person

12.3 Reliability

12.3.1 Stability and test–retest reliability

12.3.2 Equivalence

12.3.3 Split-half and alpha reliability—internal consistency

12.3.4 Kuder–Richardson reliability for dichotomous items

12.3.5 Rater agreement—kappa (*K*)

12.4 Validity

12.4.1 Expert judgment

12.4.2 Criterion-related validity

12.4.3 Construct validity

12.5 Factor analysis

12.6 PCF analysis

12.6.1 Orthogonal rotation: Varimax

12.6.2 Oblique rotation: Promax

12.7 But we wanted one scale, not four scales

12.7.1 Scoring our variable

12.8 Summary

12.9 Exercises

13 Working with missing values—multiple imputation

13.1 The nature of the problem

13.2 Multiple imputation and its assumptions about the mechanism for missingness

13.3 What variables do we include when doing imputations?

13.4 Multiple imputation

13.5 A detailed example

13.5.1 Preliminary analysis

13.5.2 Setup and multiple-imputation stage

13.5.3 The analysis stage

13.5.4 For those who want an *R*² and standardized *β*s

13.5.5 When impossible values are imputed

13.6 Summary

13.7 Exercises

14 The sem and gsem commands

14.1 Ordinary least-squares regression models using sem

14.1.1 Using the SEM Builder to fit a basic regression model

14.2 A quick way to draw a regression model and a fresh start

14.2.1 Using sem without the SEM Builder

14.3 The gsem command for logistic regression

14.3.1 Fitting the model using the logit command

14.3.2 Fitting the model using the gsem command

14.4 Path analysis and mediation

14.5 Conclusions and what is next for the sem command

14.6 Exercises

A What’s next?

A.1 Introduction to the appendix

A.2 Resources

A.2.1 Web resources

A.2.2 Books about Stata

A.2.3 Short courses

A.2.4 Acquiring data

A.3 Summary

References

Author index (pdf)

Subject index (pdf)