The Workflow of Data Analysis Using Stata

J. Scott Long
Publisher: Stata Press
Copyright: 2009
Pages: 379
Paperback: ISBN-13 978-1-59718-047-4; price $56.00
eBook (VitalSource): ISBN-13 978-1-59718-210-2; price $46.00
Kindle: ISBN-13 978-1-59718-211-9; price $43.00

Review from the Stata Journal
Long awarded Leamer-Rosenthal Prize
Chinese translation available

Comment from the Stata technical group

The Workflow of Data Analysis Using Stata, by J. Scott Long, is an essential productivity tool for data analysts. Aimed at anyone who analyzes data, this book presents an effective strategy for designing and doing data-analytic projects.

In this book, Long presents lessons drawn from his experience writing numerous academic publications, coauthoring the immensely popular Regression Models for Categorical Dependent Variables Using Stata, and coauthoring the SPOST routines, which are downloaded over 20,000 times a year.

A workflow of data analysis is a process for managing all aspects of data analysis. Planning, documenting, and organizing your work; cleaning the data; creating, renaming, and verifying variables; performing and presenting statistical analyses; producing replicable results; and archiving what you have done are all integral parts of your workflow.

Long shows how to design and implement efficient workflows for both one-person projects and team projects. He guides you toward streamlining your workflow because a good workflow is essential for replicating your work, and replication is essential for good science.

An efficient workflow reduces the time you spend doing data management and lets you produce datasets that are easier to analyze. When you methodically clean your data and carefully choose names and effective labels for your variables, the time you spend doing statistical and graphical analyses will be more productive and more enjoyable.

After introducing workflows and explaining how a better workflow can make it easier to work with data, Long describes planning, organizing, and documenting your work. He then shows how to write and debug Stata do-files and how to use local and global macros. Long presents conventions for naming, labeling, documenting, and verifying variables that greatly simplify data analysis. He also covers cleaning, analyzing, and protecting your data.
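To give a feel for the do-files and local macros mentioned above, here is a minimal sketch. It is not taken from the book: the file name wf-example and the choice of variables are hypothetical, and auto.dta is the example dataset that ships with Stata.

```stata
* wf-example.do (hypothetical name): a small, self-contained do-file
capture log close                  // close any log left open by an earlier run
log using wf-example, replace text // keep a plain-text record of the results

version 10                         // pin command behavior to one Stata release
clear all
set linesize 80

sysuse auto, clear                 // example dataset shipped with Stata

* A local macro holds the predictor list so it is typed only once
local rhs "weight length foreign"

summarize mpg `rhs'
regress mpg `rhs'

log close
exit
```

Because the predictor list lives in one local macro, changing the model means editing one line, and the log file records exactly what was run.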

While describing effective workflows, Long also introduces the concepts of basic data management using Stata and writing Stata do-files. Using real-world examples, Stata commands, and Stata scripts, Long illustrates effective techniques for managing your data and analyses. If you analyze data, this book is recommended for you.

Comments from readers

You have written the book that I had planned to write someday. But I’m glad I didn’t—your book is much better. Congratulations, this was greatly needed.

Prof. Bill Gardner
The Ohio State University

I will post the announcement of Workflow on my door with the following note: “I’m glad to help anybody who followed at least 25% of the advice Long provides—and brings me their do-files!”

Prof. Alan C. Acock
Oregon State University

I just wanted to send you a thank you for taking the time to write this book. I feel a little like an obsessed fan because I read it for several hours last night, bought 3 copies for my new research team and am presenting our new organization scheme tomorrow. It turns out that we have just finished a first flurry of data collection and hiring and I’ve been scratching my head about how to systematize some aspects. It is a perfect time to superimpose a structure. I’ve used aspects of your plan in my own work (hence my eagerness to adopt) but having this coherent volume is a wonderful and practical resource. I learned a lot from reading this. Thank you!

Elizabeth Gifford, Ph.D.
Research Scientist
Duke University

I just received a knock at my door with my new copy of The Workflow of Data Analysis Using Stata. I immediately ripped off the packaging and began perusing it. Just before the knock, I was attempting to write a program to get Stata to save the r(mean) and r(sd) for two variables following a summarize command to be saved for a ttesti command. After looking at your book for about two minutes, I stumbled upon pages 91–92, where it gave me all the information I need. … I have only had the book about 10 minutes and already it has made my life easier. Thanks much, and I am already looking forward to reading the rest of the book!

Claire M. Kamp Dush, Ph.D.
The Ohio State University

I am a Spanish professor of public economics who is at present enjoying a study-research leave at Melbourne University (Australia). Because of that I have had the time to read your book from cover to cover. I just want to thank you for the incredible work you have done! A book such as this one is a must for anyone trying to make an academic career. Definitely, I will recommend it to my graduate students as soon as I go back to Spain. If I had the chance to reach this book twenty years ago I would have been much more efficient doing my work. Never is it too late! Thanks!

Prof. Jose Felix Sanz-Sanz
Dept. of Applied Economics
Universidad Complutense de Madrid
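
The task one reader describes above, carrying r(mean) and r(sd) from summarize into ttesti, might look roughly like the following. This is a sketch under stated assumptions, not the code on the pages she cites: it uses auto.dta, which ships with Stata, and two subsamples of one variable stand in for her two variables.

```stata
* Sketch only: the dataset and variable choices are illustrative
sysuse auto, clear

* summarize leaves its results in r(); copy them into local macros
* before the next r-class command overwrites them
quietly summarize mpg if foreign == 0
local n1  = r(N)
local m1  = r(mean)
local sd1 = r(sd)

quietly summarize mpg if foreign == 1
local n2  = r(N)
local m2  = r(mean)
local sd2 = r(sd)

* Two-sample t test computed from the saved summary statistics
ttesti `n1' `m1' `sd1' `n2' `m2' `sd2'
```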

About the author

J. Scott Long is Chancellor’s Professor of Sociology and Statistics and Associate Vice Provost for Research at Indiana University–Bloomington. He has contributed articles to many journals, including American Sociological Review, Social Forces, American Statistician, and Sociological Methods and Research. He was editor of Sociological Methods and Research from 1987 to 1994. Dr. Long has authored or edited seven previous books on statistics, including the highly acclaimed Regression Models for Categorical and Limited Dependent Variables. In 2001, he received the Paul Lazarsfeld Memorial Award for Distinguished Contributions to Sociological Methodology. Each summer at the University of Michigan, he teaches workshops at the Inter-University Consortium for Political and Social Research Summer Program in Quantitative Methods of Social Research.
