{smcl} {* Last revised April 18, 2006}{...} {hline} help for {hi:gologit2} {hline} {title:Generalized Ordered Logit Models for Ordinal Dependent Variables} {p 8 15 2} {cmdab:gologit2} {it:depvar} [{it:indepvars}] [{it:weight}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [{cmd:,} {cmdab:p:l} {cmdab:p:l(}{it:varlist}{cmd:)} {cmdab:np:l} {cmdab:np:l(}{it:varlist}{cmd:)} {cmdab:auto:fit} {cmdab:auto:fit(}{it:alpha}{cmd:)} {cmd:link(}{it:logit/probit/cloglog/loglog/cauchit}{cmd:)} {cmdab:force} {cmdab:lrf:orce} {cmdab:g:amma} {cmdab:g:amma(}{it:name}{cmd:)} {cmdab:nol:abel} {cmdab:sto:re(}{it:name}{cmd:)} {cmdab:c:onstraints:(}{it:clist}{cmd:)} {cmdab:r:obust} {cmdab:cl:uster:(}{it:varname}{cmd:)} {cmdab:l:evel:(}{it:#}{cmd:)} {cmdab:or} {cmdab:log} {cmdab:v1} {cmd:svy} {it:svy_options} {it:maximize_options} ] {p 4 4 2} where {it:svy_options} are {p 8 8 2} {cmdab:sub:pop:(}{it:subpop_spec}{cmd:)} {cmdab:nosvy:adjust} {cmdab:pr:ob} {cmd:ci} {cmd:deff} {cmd:deft} {cmd:meff} {cmd:meft} {p 4 8 2} and {it:subpop_spec} is {p 12 12 2} [{it:varname}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [, {cmdab:srs:subpop} ] {p 4 4 2} {cmd:gologit2} shares the features of all estimation commands; see help {help est}. {cmd:gologit2} typed without arguments redisplays previous results. The following options may be given when redisplaying results: {p 8 8 2} {cmdab:g:amma} {cmdab:g:amma(}{it:name}{cmd:)} {cmdab:sto:re(}{it:name}{cmd:)} {cmd:or} {cmdab:l:evel:(}{it:#}{cmd:)} {cmdab:pr:ob} {cmd:ci} {cmd:deff} {cmd:deft} {p 4 4 2} {cmd:gologit2} works under both Stata 8.2 and Stata 9 or higher. If using Stata 9, the {opt by}, {opt nestreg}, {opt stepwise}, {opt xi}, and possibly other prefix commands are allowed; see {help prefix}. The {cmd:svy} prefix command is NOT currently supported; use the {cmd:svy} option instead. {p 4 4 2} {cmd:fweight}s, {cmd:iweight}s, and {cmd:pweight}s are allowed; see help {help weights}. 
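
{p 4 4 2}
As a minimal sketch of the redisplay syntax described above, and assuming a {cmd:gologit2} model has already been fitted in the current session, the following would replay the most recent results as odds ratios with 90% confidence intervals and then show the gamma parameterization. The choice of 90 is only illustrative.

{p 8 12 2}{cmd:. * Replay the last gologit2 results without re-estimating}{p_end}
{p 8 12 2}{cmd:. gologit2, or level(90)}{p_end}
{p 8 12 2}{cmd:. * Replay again, this time showing the gamma parameterization}{p_end}
{p 8 12 2}{cmd:. gologit2, gamma}{p_end}
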
{p 4 4 2} The syntax of {help predict} following {cmd:gologit2} is {p 8 16 2}{cmd:predict} [{it:type}] {it:newvarname}({it:s}) [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [{cmd:,} {it:statistic} {cmdab:o:utcome:(}{it:outcome}{cmd:)} ] {p 4 4 2} where {it:statistic} is {p 8 21 2}{cmd:p}{space 8}probability (specify one new variable and {cmd:outcome()} option, or specify k new variables, k = # of outcomes); the default{p_end} {p 8 21 2}{cmd:xb}{space 7}linear prediction ({cmd:outcome()} option required){p_end} {p 8 21 2}{cmd:stdp}{space 5}S.E. of linear prediction ({cmd:outcome()} option required){p_end} {p 8 21 2}{cmd:stddp}{space 4}S.E. of difference in linear predictions ({cmd:outcome()} option is {cmd:outcome(}{it:outcome1}{cmd:,}{it:outcome2}{cmd:)}){p_end} {p 4 4 2} Note that you specify one new variable with {cmd:xb}, {cmd:stdp}, and {cmd:stddp} and specify either one or k new variables with {cmd:p}. {p 4 4 2} These statistics are available both in and out of sample; type "{cmd:predict} {it:...} {cmd:if e(sample)} {it:...}" if wanted only for the estimation sample. {title:Description} {p 4 4 2} {cmd:gologit2} is a user-written program that estimates generalized ordered logit models for ordinal dependent variables. The actual values taken on by the dependent variable are irrelevant except that larger values are assumed to correspond to "higher" outcomes. Up to 20 outcomes are allowed. {cmd:gologit2} is inspired by Vincent Fu's {cmd:gologit} program and is backward compatible with it but offers several additional powerful options. {p 4 4 2} A major strength of {cmd:gologit2} is that it can also estimate three special cases of the generalized model: the {it} proportional odds/parallel lines model{sf}, the {it}partial proportional odds model{sf}, and the {it}logistic regression model{sf}. 
Hence, {cmd:gologit2} can estimate models that are less restrictive than the proportional odds /parallel lines models estimated by {cmd:ologit} (whose assumptions are often violated) but more parsimonious and interpretable than those estimated by a non-ordinal method, such as multinomial logistic regression (i.e. {cmd:mlogit}). The {cmd:autofit} option greatly simplifies the process of identifying partial proportional odds models that fit the data. {p 4 4 2} An alternative but equivalent parameterization of the model that has appeared in the literature is reported when the {cmd:gamma} option is selected. Other key advantages of {cmd:gologit2} include support for linear constraints (making it possible to use {cmd:gologit2} for constrained logistic regression), survey data estimation, and the computation of estimated probabilities via the {cmd:predict} command. {p 4 4 2} Also, if the user considers them more appropriate for their data, probit, complementary log-log, log-log and cauchit links can be used instead of logit by specifying the {cmd:link} option, e.g. {opt link(l)} for logit (the default), {opt link(p)} for probit, {opt link(c)} for complementary log-log, {opt link(ll)} for log-log, and {opt link(ca)} for cauchit. {p 4 4 2} {cmd:gologit2} works under both Stata 8.2 and Stata 9 or higher. Syntax is the same for both versions; but if you are using Stata 9 or higher, gologit2 supports several prefix commands, including {cmd:by}, {cmd:nestreg}, {cmd:xi} and {cmd:sw}. Stata 9's {cmd:svy} prefix command is NOT currently supported; use the {cmd:svy} option instead. {p 4 4 2} More information on the statistical theory behind {cmd:gologit2} as well as several worked examples and a troubleshooting FAQ can be found at {browse "http://www.nd.edu/~rwilliam/gologit2/"}. 
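
{p 4 4 2}
As a rough illustration of the special cases described above, the following sketch fits the unconstrained model, a partial proportional odds model, and the parallel lines model in turn. It assumes that the {cmd:ordwarm2} data used in the Examples section can be downloaded; freeing {cmd:yr89} and {cmd:male} in {cmd:npl()} is purely illustrative here. The last two commands should give equivalent fits, although {cmd:ologit} reports cutpoints where {cmd:gologit2} reports constants.

{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. * Generalized model: no variable is constrained (the default)}{p_end}
{p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst}{p_end}
{p 8 12 2}{cmd:. * Partial proportional odds: only yr89 and male are freed}{p_end}
{p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, npl(yr89 male)}{p_end}
{p 8 12 2}{cmd:. * Parallel lines: all variables constrained, comparable to ologit}{p_end}
{p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, pl}{p_end}
{p 8 12 2}{cmd:. ologit warm yr89 male white age ed prst}{p_end}
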
{title:Warning & Error Messages}

{p 4 4 2}Note: A troubleshooting FAQ with additional information can be found at{break}
{browse "http://www.nd.edu/~rwilliam/gologit2/tsfaq.html"}

{p 4 4 2}An oddity of gologit/goprobit models is that it is possible to get negative predicted probabilities. McCullagh & Nelder discuss this in {it:Generalized Linear Models}, 2nd edition, 1989, p. 155: "The usefulness of non-parallel regression models is limited to some extent by the fact that the lines must eventually intersect. Negative fitted values are then unavoidable for some values of x, though perhaps not in the observed range. If such intersections occur in a sufficiently remote region of the x-space, this flaw in the model need not be serious."

{p 4 4 2}This seems to be a fairly rare occurrence, and when it does occur there are often other problems with the model, e.g. the model is overly complicated and/or there are very small Ns for some categories of the dependent variable. Combining categories or simplifying the model often helps. {opt gologit2} will give a warning message whenever any in-sample predicted probabilities are negative. If it is just a few cases, it may not be worth worrying about, but if there are many cases you may wish to modify your model, data, or sample, or use a different statistical technique altogether.

{title:Options}

{p 4 8 2}
{cmd:pl}, {cmd:npl}, {cmd:npl()}, {cmd:pl()}, {cmd:autofit} and {cmd:autofit()} provide alternative means for imposing or relaxing the proportional odds/parallel lines assumption. Only one may be specified at a time.

{p 8 12 2}
{cmd:autofit(}{it:alpha}{cmd:)} uses an iterative process to identify the partial proportional odds model that best fits the data. {it:alpha} is the desired significance level for the tests; {it:alpha} must be greater than 0 and less than 1. If {cmd:autofit} is specified without parameters, the default alpha-value is .05.
Note that the higher {it:alpha} is, the easier it is to reject the parallel lines assumption, and the less parsimonious the model will tend to be. This option can take a little while because several models may need to be estimated. The use of {cmd:autofit} is highly recommended, but the other options provide more control over the final model if the user wants it.

{p 8 12 2}
{cmd:pl} specified without parameters constrains all independent variables to meet the proportional odds/parallel lines assumption. It will produce results that are equivalent to {cmd:ologit}.

{p 8 12 2}
{cmd:npl} specified without parameters relaxes the proportional odds/parallel lines assumption for all explanatory variables. This is the default option and presents results equivalent to the original {cmd:gologit}.

{p 8 12 2}
{cmd:pl(}{it:varlist}{cmd:)} constrains the specified explanatory variables to meet the proportional odds/parallel lines assumption. The effects of all other variables need not meet the assumption. The variables specified must be a subset of the explanatory variables.

{p 8 12 2}
{cmd:npl(}{it:varlist}{cmd:)} frees the specified explanatory variables from meeting the proportional odds/parallel lines assumption. All other explanatory variables are constrained to meet the assumption. The variables specified must be a subset of the explanatory variables.

{p 4 8 2}
{cmd:link(}{it:logit/probit/cloglog/loglog/cauchit}{cmd:)} specifies the link function to be used. The legal values are {opt link(logit)}, {opt link(probit)}, {opt link(cloglog)}, {opt link(loglog)}, and {opt link(cauchit)}, which can be abbreviated as {opt link(l)}, {opt link(p)}, {opt link(c)}, {opt link(ll)} and {opt link(ca)}. {opt link(logit)} is the default if the option is omitted.

{p 8 8 2}
The following advice is adapted from Norusis (2005, p. 84): Probit and logit models are reasonable choices when the changes in the cumulative probabilities are gradual.
If there are abrupt changes, other link functions should be used. The log-log link may be a good model when the cumulative probabilities increase from 0 fairly slowly and then rapidly approach 1. If the opposite is true, namely that the cumulative probability for lower scores is high and the approach to 1 is slow, the complementary log-log link may describe the data. The cauchit distribution has tails that are bigger than the normal distribution’s, hence the cauchit link may be useful when you have more extreme values in either direction. {p 8 8 2} NOTE: Programs differ in the names used for these latter two links. Stata's loglog link corresponds to SPSS PLUM's cloglog link; and Stata's cloglog link is called nloglog in SPSS. {p 8 8 2} NOTE: Post-estimation commands that work with this program may support some links but not others. Check the program documentation to be sure it works correctly with the link you are using. For example, post-estimation commands that work with the original {cmd:gologit} will often work with this program, but only if you are using the logit link. {p 4 8 2} {opt force} can be used to force {cmd:gologit2} to issue only warning messages in some situations when it would normally give a fatal error: {p 8 8 2} By default, the dependent variable can have a maximum of 20 categories. A variable with more categories than that is probably a mistaken entry by the user, e.g. a continuous variable has been specified rather than an ordinal one. But, if your dependent variable really is ordinal with more than 20 categories, {opt force} will let {cmd:gologit2} analyze it (although other practical limitations, such as small sample sizes within categories, may keep it from coming up with a final solution.) {p 8 8 2} Also, variables specified in {opt npl(varlist)} and {opt pl(varlist)} are required to be a subset of the explanatory variables. 
However, prefix commands like {opt sw} and {opt nestreg} estimate models that may use only a subset of the X variables, and those subsets may not include all the variables specified in the npl/pl varlists. Using {opt force} will allow model estimation to continue in those cases.

{p 8 8 2}
Obviously, you should only use {opt force} when you are confident that you are not making a mistake.

{p 4 8 2}
{cmd:lrforce} forces Stata to report a Likelihood Ratio Statistic under certain conditions when it ordinarily would not. Some types of constraints can make a Likelihood Ratio chi-square test invalid. Hence, to be safe, Stata reports a Wald statistic whenever constraints are used. However, Likelihood Ratio statistics should be correct for the types of constraints imposed by the {cmd:pl} and {cmd:npl} options. Note that the {cmd:lrforce} option will be ignored when robust standard errors are specified either directly or indirectly, e.g. via use of the {cmd:robust} or {cmd:svy} options. Use this option with caution if you specify other constraints, since these may make an LR chi-square statistic inappropriate.

{p 4 8 2}
{cmd:store(}{it:name}{cmd:)} causes the command {cmd:estimates store {it:name}} to be executed when {cmd:gologit2} finishes. This is useful when you wish to estimate a series of models and want to save the results. See help {help estimates}.

{p 4 8 2}
{cmd:gamma} displays an alternative but equivalent parameterization of the partial proportional odds model used by Peterson and Harrell (1990) and Lall et al. (2002). Under this parameterization, there is one Beta coefficient and M-2 Gamma coefficients for each explanatory variable, where M = the number of categories for Y. The gammas indicate the extent to which the proportional odds assumption is violated by the variable, i.e. when the gammas do not significantly differ from 0 the proportional odds assumption is met.
One advantage of this parameterization is that it is more parsimonious than the default layout. In addition, by examining the test statistics for the Gammas, you can get a feel for which variables meet the proportionality assumption and which do not.

{p 4 8 2}
{opt gamma(name)} causes the gamma estimates to be stored as {it:name}, e.g. {opt g(gamma1)} would store the gamma estimates under the name {it:gamma1}. This makes the gamma results easily usable with post-estimation table formatting commands like {cmd:outreg2} and {cmd:estout}. Do NOT try to make the gamma results active and then use other post-estimation commands, e.g. {cmd:predict} or {cmd:test}. Such commands either will not work or, if they do work, may give incorrect results. Note that only the variances and standard errors of the gamma estimates are correct; all the covariances of the estimates are set equal to zero.

{p 4 8 2}
{cmd:nolabel} causes the equations to be named eq1, eq2, etc. The default is to use the first 32 characters of the value labels and/or the values of Y as the equation labels. Note that some characters cannot be used in equation names, e.g. the space ( ), the period (.), the dollar sign ($), and the colon (:), and will be replaced with the underscore (_) character. The default behavior works well when the value labels are short and descriptive. It may not work well when value labels are very long and/or include characters that have to be changed to underscores. If the printout looks unattractive and/or you are getting strange errors, try changing the value labels of Y or else use the {cmd:nolabel} option.

{p 4 8 2}
{cmd:v1} causes {cmd:gologit2} to return results in a format that is consistent with {cmd:gologit 1.0}. This may be useful or necessary for post-estimation commands that were written specifically for {cmd:gologit}. However, post-estimation commands written for {cmd:gologit2} may not work correctly if {cmd:v1} is specified.
The {cmd:v1} option only works when using {cmd:link(logit)}.

{p 4 8 2}
{cmd:log} displays the iteration log. By default it is suppressed.

{p 4 8 2}
{cmd:or} reports the estimated coefficients transformed to odds ratios, i.e., exp(b) rather than b; see {hi:[R] ologit} for a description of this concept. Options {cmd:rrr}, {cmd:eform}, {cmd:irr} and {cmd:hr} produce identical results (labeled differently) and can also be used. It is up to the user to decide whether the exp(b) transformation makes sense given the link function used, e.g. it probably doesn't make sense when using the probit link.

{p 4 8 2}
{cmd:constraints(}{it:clist}{cmd:)} specifies the linear constraints to be applied during estimation. The default is to perform unconstrained estimation. Constraints are defined with the {help constraint} command. {cmd:constraints(1)} specifies that the model is to be constrained according to constraint 1; {cmd:constraints(1-4)} specifies constraints 1 through 4; {cmd:constraints(1-4,8)} specifies 1 through 4 and 8. Keep in mind that the {cmd:pl}, {cmd:npl}, and {cmd:autofit} options work by generating across-equation constraints, which may affect how any additional constraints should be specified. When using the {cmd:constraint} command, it is usually easiest and safest to refer to equations by their equation #, e.g. #1, #2, etc.

{p 4 8 2}
{cmd:robust} specifies that the Huber/White/sandwich estimator of variance is to be used in place of the traditional calculation. {cmd:robust} combined with {cmd:cluster()} allows for observations that are not independent within cluster (although they must be independent between clusters). If you specify {cmd:pweight}s, {cmd:robust} is implied.

{p 4 8 2}
{cmd:cluster(}{it:varname}{cmd:)} specifies that the observations are independent across groups (clusters) but not necessarily within groups.
{it:varname} specifies to which group each observation belongs; e.g., {cmd:cluster(personid)} in data with repeated observations on individuals. {cmd:cluster()} affects the estimated standard errors and variance-covariance matrix of the estimators (VCE), but not the estimated coefficients. {cmd:cluster()} can be used with {cmd:pweight}s to produce estimates for unstratified cluster-sampled data.

{p 4 8 2}
{cmd:level(}{it:#}{cmd:)} specifies the confidence level in percent for the confidence intervals of the coefficients; see help {help level}.

{p 4 8 2}
{it:maximize_options} control the maximization process; see help {help maximize}. You should never have to specify most of these. However, the {opt difficult} option can sometimes be useful with models that are running very slowly or not converging at all.

{title:Additional Options for Survey Data Estimation}

{p 4 8 2}
Stata 9's {cmd:svy} prefix command is not currently supported. The {cmd:svy} option accomplishes most of the same things. {cmd:svy} indicates that {cmd:gologit2} is to pick up the {cmd:svy} settings set by {cmd:svyset} and use the robust variance estimator. Thus, this option requires the data to be {cmd:svyset}; see help {help svyset}. {cmd:svy} may not be supplied with {it:weight}s or the {cmd:strata()}, {cmd:psu()}, {cmd:fpc()}, or {cmd:cluster()} options. When using {cmd:svy} estimation, use of {cmd:if} or {cmd:in} restrictions will not produce correct variance estimates for subpopulations in many cases. To compute estimates for subpopulations, use the {cmd:subpop()} option. A typical command using the {cmd:svy} option would look something like

{p 8 12 2}{cmd:. gologit2 health female black age age2, autofit svy}{p_end}

{p 4 8 2}
The following options are available when the {cmd:svy} option has been specified. If {cmd:svy} has not been specified, use of these options will produce an error.
{p 8 12 2} {cmd:subpop(}{it:subpop_spec}{cmd:)} specifies that estimates be computed for the single subpopulation identified in {it:subpop_spec}. {it:subpop_spec} is {p 16 16 2} [{it:varname}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [, {cmdab:srs:subpop} ] {p 12 12 2} Thus the subpopulation is defined by the observations for which {it:varname}!=0 that also meet the {cmd:if} and {cmd:in} conditions. Typically, {it:varname}=1 defines the subpopulation and {it:varname}=0 indicates observations not belonging to the subpopulation. For observations whose subpopulation status is uncertain, {it:varname} should be set to missing. {p 12 16 2} {cmd:srssubpop} requests that deff and deft be computed using an estimate of simple-random-sampling variance for sampling within a subpopulation. If {cmd:srssubpop} is not specified, deff and deft are computed using an estimate of simple-random-sampling variance for sampling from the entire population. Typically, {cmd:srssubpop} would be given when computing subpopulation estimates by strata or by groups of strata. {p 8 12 2} {cmd:nosvyadjust} specifies that the model Wald test be carried out as W/k distributed F(k,d), where W is the Wald test statistic, k is the number of terms in the model excluding the constant, d = total number of sampled PSUs minus total number of strata, and F(k,d) is an F distribution. By default, an adjusted Wald test is conducted: (d-k+1)W/(kd) distributed F(k,d-k+1). Use of the {cmd:nosvyadjust} option is not recommended. {p 8 12 2} {cmd:prob} requests that the t statistic and p-value be displayed. The degrees of freedom for the t are d = total number of sampled PSUs minus the total number of strata (regardless of the number of terms in the model). If no display options are specified then, by default, the t statistic and p-value are displayed. {p 8 12 2} {cmd:ci} requests that confidence intervals be displayed. If no display options are specified then, by default, confidence intervals are displayed. 
{p 8 12 2}
{cmd:deff} requests that the design-effect measure deff be displayed.

{p 8 12 2}
{cmd:deft} requests that the design-effect measure deft be displayed.

{p 8 12 2}
{cmd:meff} requests that the meff measure of misspecification effects be displayed. This option must be specified at the time of the initial estimation.

{p 8 12 2}
{cmd:meft} requests that the meft measure of misspecification effects be displayed. This option must be specified at the time of the initial estimation.

{title:Options for predict}

{p 4 8 2}
{cmd:p}, the default, calculates predicted probabilities.

{p 8 8 2}
If you do not also specify the {cmd:outcome()} option, you must specify k new variables. For instance, say you fitted your model by typing "{cmd:gologit2 insure age male}" and that {cmd:insure} takes on three values. Then you could type "{cmd:predict p1 p2 p3, p}" to obtain all three predicted probabilities.

{p 8 8 2}
If you also specify the {cmd:outcome()} option, then you specify one new variable. Say that {hi:insure} took on values 1, 2, and 3. Then typing "{cmd:predict p1, p outcome(1)}" would produce the same {hi:p1} as above, "{cmd:predict p2, p outcome(2)}" the same {hi:p2} as above, etc. If {hi:insure} took on values 7, 22, and 93, you would specify {cmd:outcome(7)}, {cmd:outcome(22)}, and {cmd:outcome(93)}. Alternatively, you could specify the outcomes by referring to the equation number ({cmd:outcome(#1)}, {cmd:outcome(#2)}, and {cmd:outcome(#3)}), or, if the variable values are labeled, you can do something like {cmd:outcome(low)}, {cmd:outcome(medium)}, and {cmd:outcome(high)}.

{p 4 8 2}
{cmd:xb} calculates the linear prediction. You must also specify the {cmd:outcome()} option.

{p 4 8 2}
{cmd:outcome()} specifies for which outcome the statistic is to be calculated. {cmd:equation()} is a synonym for {cmd:outcome()}: it does not matter which you use.
{cmd:outcome()} and {cmd:equation()} can be specified using (1) {cmd:#1}, {cmd:#2}, ..., with {cmd:#1} meaning the first category of the dependent variable, {cmd:#2} the second category, etc.; (2) values of the dependent variable; or (3) the value labels (if any) of the dependent variable.

{p 4 8 2}
{cmd:stdp} calculates the standard error of the linear prediction. You must also specify the {cmd:outcome()} option.

{p 4 8 2}
{cmd:stddp} calculates the standard error of the difference in two linear predictions. You must specify option {cmd:outcome()} and in this case you specify the two particular outcomes of interest inside the parentheses; for example, "{cmd:predict sed, stddp outcome(1,3)}".

{title:Some cautions about using {opt gologit2} with the Stata 9 prefix commands}

{p 4 4 2}
In general, {opt gologit2} seems to work well with Stata 9's prefix commands, e.g. {opt xi}, {opt nestreg}. Users should be aware, however, that some combinations of prefix commands and {cmd:gologit2} options can be problematic.

{p 4 4 2}
The {opt sw} and {opt nestreg} prefix commands should work fine with the {opt pl} and {opt npl} options. However, they may produce error messages and/or unexpected results when combined with the {opt autofit}, {opt autofit(alpha)}, {opt npl(varlist)} or {opt pl(varlist)} options. For example, the submodels estimated by {opt nestreg} may not include all the variables specified in {opt pl(varlist)}, resulting in a fatal error message. You can override this error by using the {opt force} option. {opt sw} and {opt autofit} would be an especially questionable combination since both stepwise selection of variables and stepwise selection of constraints would be going on.

{p 4 4 2}
Other than the above, {opt gologit2} will hopefully work fine in most common situations where prefix commands are used.
However, there are many possible esoteric combinations of prefix commands and {opt gologit2} options and {opt gologit2} is not guaranteed to be problem-free with all of them. {title:Examples} {p 4 4 2} {it}Example 1. Proportional Odds/ Parallel Lines Assumption Violated.{sf} Long and Freese (2003) present data from the 1977/ 1989 General Social Survey. Respondents are asked to evaluate the following statement: "A working mother can establish just as warm and secure a relationship with her child as a mother who does not work." Responses were coded as 1 = Strongly Disagree (1SD), 2 = Disagree (2D), 3 = Agree (3A), and 4 = Strongly Agree (4SA). We can do a global test of the proportional odds assumption by estimating a model in which no variables are constrained to meet the assumption and contrasting it with a model in which all are so constrained (the latter is equivalent to the {cmd:ologit} model). {p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end} {p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, store(unconstrained)}{p_end} {p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, pl lrf store(constrained)} {p_end} {p 8 12 2}{cmd:. lrtest constrained unconstrained}{p_end} {p 4 4 2}The LR chi-square from the above is 49.20. These data fail the global test. {p 4 4 2} However, we can now use the {cmd:autofit} option to see whether a partial proportional odds model can fit the data. In a partial proportional odds model, some variables meet the proportional odds assumption while others do not. {p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, autofit}{p_end} {p 4 4 2} The results show that 4 of the 6 variables (white, age, ed, prst) meet the parallel lines assumption. Only yr89 and male do not. This model is less restrictive than a model estimated by {cmd:ologit} would be (whose assumptions are violated in this case) but much more parsimonious than a non-ordinal alternative such as {cmd:mlogit}. 
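
{p 4 4 2}
If more control over the final model is wanted, the model that {cmd:autofit} selected can, as a sketch, be requested directly with {cmd:npl()} and contrasted with the unconstrained model. This assumes the estimates stored as {it:unconstrained} earlier in this example are still in memory; the name {it:partial} is only illustrative.

{p 8 12 2}{cmd:. * Free only yr89 and male; the other four variables stay constrained}{p_end}
{p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, npl(yr89 male) lrf store(partial)}{p_end}
{p 8 12 2}{cmd:. lrtest partial unconstrained}{p_end}

{p 4 4 2}
A stricter significance level for the {cmd:autofit} tests can be requested in the same way; the value .01 here is an arbitrary choice.

{p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, autofit(.01)}{p_end}
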
{p 4 4 2}
{it}Example 2. Alternative parameterization using the {cmd:gamma} option.{sf} Peterson & Harrell (1990) and Lall et al. (2002) present an alternative parameterization of the partial proportional odds model. In this parameterization, gamma coefficients represent deviations from proportionality. When the gammas of an explanatory variable do not significantly differ from 0, the parallel lines assumption for that variable is met. Using the {cmd:autofit} and {cmd:gamma} options, we can (a) confirm that Lall et al. came up with the correct partial proportional odds model, and (b) replicate the results from their Table 5.

{p 8 12 2}{cmd:. use http://www.nd.edu/~rwilliam/gologit2/lall.dta}{p_end}
{p 8 12 2}{cmd:. gologit2 hstatus heart smoke, lrf gamma autofit}{p_end}

{p 4 4 2}
{it}Example 3. Survey data estimation.{sf} By using the {cmd:svy} option, we can estimate models with data that have been {cmd:svyset}.

{p 8 12 2}{cmd:. use http://www.stata-press.com/data/r8/nhanes2f.dta}{p_end}
{p 8 12 2}{cmd:. gologit2 health female black age age2, svy autofit}{p_end}
{p 8 12 2}{cmd:. gologit2 health black age age2, svy autofit subpop(female)}{p_end}

{p 4 4 2}
{it}Example 4. {cmd:gologit 1.0} compatibility.{sf} Some post-estimation commands - specifically, the {cmd:spost} routines of Long and Freese - currently work with the original {cmd:gologit} but not {cmd:gologit2}. That should change in the future. For now, you can use the {cmd:v1} option to make the stored results from {cmd:gologit2} compatible with {cmd:gologit 1.0}. (Note, however, that this may make the results incompatible with post-estimation routines written for {cmd:gologit2}; also, you have to be using the default logit link.) Using the working mothers data again,

{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. * Use the v1 option to save internally stored results in gologit 1.0 format}{p_end}
{p 8 12 2}{cmd:. quietly gologit2 warm yr89 male white age ed prst, pl(yr89 male) lrf v1}{p_end}
{p 8 12 2}{cmd:. * Use one of Long & Freese's spost routines}{p_end}
{p 8 12 2}{cmd:. prvalue, x(male=0 yr89=1 age=30) rest(mean)}{p_end}

{p 4 4 2}
{it}Example 5. The {cmd:predict} command. {sf} In addition to the standard options ({cmd:xb}, {cmd:stdp}, {cmd:stddp}), the {cmd:predict} command supports the {cmd:pr} option (abbreviated {cmd:p}) for predicted probabilities; {cmd:pr} is the default option if nothing else is specified. For example,

{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. quietly gologit2 warm yr89 male white age ed prst, pl(yr89 male) lrf}{p_end}
{p 8 12 2}{cmd:. predict p1 p2 p3 p4}{p_end}

{p 4 4 2}
{it}Example 6. Constrained logistic regression. {sf} The models estimated by Stata's {cmd:logit} and {cmd:ologit} commands are special cases of the gologit model, but neither of these commands currently supports the use of linear constraints, such as two variables having equal effects. {cmd:gologit2} can be used for this purpose. For example,

{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. recode warm (1 2 = 0)(3 4 = 1), gen(agree)}{p_end}
{p 8 12 2}{cmd:. * Constrain the effects of male and white to be equal}{p_end}
{p 8 12 2}{cmd:. constraint 1 male = white}{p_end}
{p 8 12 2}{cmd:. gologit2 agree yr89 male white age ed prst, lrf store(constrained) c(1)}{p_end}

{p 4 4 2}
{it}Example 7. Other link functions. {sf} By default, and as its name implies, {cmd:gologit2} uses the logit link. If you prefer, however, you can specify probit, complementary log-log, log-log, or cauchit links. For example, to estimate a goprobit model,

{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, link(p)}{p_end}

{p 4 4 2}
{it}Example 8. Prefix commands.
{sf} If you are using Stata 9 or higher, {cmd:gologit2} supports many of the prefix commands. For example, {p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end} {p 8 12 2}{cmd:. sw, pe(.05): gologit2 warm yr89 male}{p_end} {p 8 12 2}{cmd:. xi: gologit2 warm yr89 i.male}{p_end} {p 8 12 2}{cmd:. nestreg: gologit2 warm (yr89 male white age) (ed prst)}{p_end} {p 4 4 2} {it}Example 9. Post-estimation table formatting commands. {sf} Here is an example of how you could use {cmd:outreg2} to format the results from multiple models. In this example I use both the regular and the gamma results but most people would probably choose one or the other. The store option stores the regular results while the g option stores the results from the gamma parameterization. {p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end} {p 8 12 2}{cmd:. * Unconstrained model}{p_end} {p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, npl lrf store(gologit) g(gologit_g)}{p_end} {p 8 12 2}{cmd:. * Autofit model}{p_end} {p 8 12 2}{cmd:. gologit2 warm yr89 male white age ed prst, autofit lrf store(gologit2) g(gologit2_g)}{p_end} {p 8 12 2}{cmd:. * Use outreg2 to output the regular results in a single table}{p_end} {p 8 12 2}{cmd:. outreg2 [gologit gologit2] using regular, replace onecol long nor2 seeout}{p_end} {p 8 12 2}{cmd:. * Use outreg2 to output the gamma results in a single table}{p_end} {p 8 12 2}{cmd:. outreg2 [gologit_g gologit2_g] using gamma, replace onecol long nor2 seeout}{p_end} {title:Author} {p 5 5} Richard Williams{break} Notre Dame Department of Sociology{break} Richard.A.Williams.5@ND.Edu{break} {browse "http://www.nd.edu/~rwilliam/gologit2/"}{p_end} {title:Acknowledgements} {p 5 5} Vincent Kang Fu of the Utah Department of Sociology wrote {cmd:gologit 1.0} and graciously allowed Richard Williams to incorporate parts of its source code and documentation in {cmd:gologit2}. 
{p 5 5}The documentation for Stata 8.2's {cmd:mlogit} command and the program {cmd:mlogit_p} were major aids in developing the {cmd:gologit2} documentation and in adding support for the {cmd:predict} command. Much of the code is adapted from {it}Maximum Likelihood Estimation with Stata, Second Edition{sf}, by William Gould, Jeffrey Pitblado and William Sribney. {p 5 5} Sarah Mustillo, Dan Powers, J. Scott Long, Nick Cox, Kit Baum and Joseph Hilbe provided stimulating and helpful comments. Jeff Pitblado was extremely helpful in updating the program to use Stata 9's new features. {title:References} {p 5 5}Fu, Vincent. 1998. "Estimating Generalized Ordered Logit Models." Stata Technical Bulletin 8:160-164. {p 5 5}Lall, R., S.J. Walters, K. Morgan, and MRC CFAS Co-operative Institute of Public Health. 2002. "A Review of Ordinal Regression Models Applied on Health-Related Quality of Life Assessments." Statistical Methods in Medical Research 11:49-67. {p 5 5}Long, J. Scott and Jeremy Freese. 2006. "Regression Models for Categorical Dependent Variables Using Stata, 2nd Edition." College Station, Texas: Stata Press. {p 5 5}Norusis, Marija. 2005. "SPSS 13.0 Advanced Statistical Procedures Companion." Upper Saddle River, New Jersey: Prentice Hall. {p 5 5}Peterson, Bercedis and Frank E. Harrell Jr. 1990. "Partial Proportional Odds Models for Ordinal Response Variables." Applied Statistics 39(2):205-217. {title:Suggested citation if using {cmd:gologit2} in published work } {p 5 5}{cmd:gologit2} is not an official Stata command. It is a free contribution to the research community, like a paper. Please cite it as such. {p 5 5}Williams, Richard. 2006. "Generalized Ordered Logit/ Partial Proportional Odds Models for Ordinal Dependent Variables." The Stata Journal 6(1):58-82. A pre-publication version is available at {break} ({browse "http://www.nd.edu/~rwilliam/gologit2/gologit2.pdf"}). 
{p 5 5} The above document provides more detailed explanations and examples and is recommended reading. {title:Also see} {p 4 13 2} Online: help for {help estcom}, {help postest}, {help constraint}, {help ologit}, {help svy}, {help svyologit}