Thirty years ago, the standard practice in forensic risk assessment of dangerousness and risk for sexual reoffence was unstructured professional/clinical judgement, relying upon the experience, theoretical orientation, and knowledge of the evaluator (Meehl, 1954; Monahan, 2007). Professional/clinical judgement is subjective and idiosyncratic, making the outcome unreliable and non-replicable. As research showed that unstructured evaluations were little better than chance (Dawes et al., 1989; Grove & Meehl, 1996; Grove et al., 2000; Hanson & Morton-Bourgon, 2009; Meehl, 1954; Menzies et al., 1994; Monahan, 2007; Quinsey & Ambtman, 1979; Steadman & Cocozza, 1974), dissatisfaction with subjective professional/clinical judgement led to important meta-analytic studies and attempts to develop useful actuarially-based risk assessment instruments. This paper will present a brief overview of the development of actuarially-based risk assessment, focused on the assessment of dynamic risk factors, and go on to propose many reasons why dynamic assessment methodologies should be used in long-stay incarceration.
A Developmental Progression
During the 1980s and 1990s several research teams worked on developing practical and valid actuarial risk assessment instruments that relied primarily on static variables. These included the Statistical Information on Recidivism scale (SIR, Nuffield, 1982); the Level of Supervision Inventory (LSI, Andrews, 1982) followed by the Level of Service Inventory – Revised (LSI-R, Andrews & Bonta, 1995), and the Level of Service Inventory – Ontario Version (LSI-OR, Girard, 1999), the Violence Prediction Scheme (VPS, Webster et al., 1994), the Risk Appraisal Guide (RAG, Harris et al., 1993), the Sex Offender Risk Appraisal Guide (SORAG, Quinsey et al., 1998), the Historical, Clinical, Risk Management-20 (HCR-20, Webster et al., 1997), the Structured Anchored Clinical Judgement – Minimum (SACJ-Min, Grubin, 1998), the Sexual Violence Risk – 20 (SVR-20, Boer et al., 1997), the Violence Risk Scale (VRS, Wong & Gordon, 1999-2003), the Rapid Risk Assessment for Sexual Recidivism (RRASOR, Hanson, 1997), and the Static-99 (Hanson & Thornton, 1999).
No single variable has ever been shown to sufficiently predict sexual reoffence by itself (Kelley et al., 2020; Mann et al., 2010); hence, an agreed-upon, standardized list of risk-relevant items, relying upon their combined predictive power, is required. Lists of known or assumed risk factors for recidivism were statistically assessed; if a risk factor demonstrated reliable predictive ability it was retained in the model, and if not, it was discarded. One drawback of this approach is that these scales rely on static historical factors, such as age and extent of criminal history, and cannot be used to measure change in risk over time. Also, this method of development is atheoretical and gives little insight into the causes of reoffending. Even so, the resulting actuarial instruments predict recidivism with greater reliability and validity than clinical judgement for both general psychiatric (Grove et al., 2000) and forensic samples (Andrews et al., 2006; Bonta et al., 1998; Hanson & Bussière, 1998; Mossman, 1994). While actuarial assessment greatly improved recidivism prediction, we lacked a means of assessing whether there was any change in risk level over time, especially if the offender had received some form of treatment or supervisory intervention. To measure change in risk status it is necessary to assess changeable risk factors, evaluated across time, and add these factors to the overall evaluation of recidivism risk.
Concurrent with early scale development, Hanson and Bussière (1996, 1998) completed a meta-analytic review of the predictors of sexual recidivism. Meta-analysis is a series of statistical techniques that combine and weigh multiple studies to determine if there is a consensus, a general outcome, positive or negative, beyond that which could be attributed to random variation or sample bias. This meta-analysis reviewed 61 follow-up studies including information from 28,972 men convicted of sexual offences, identifying those factors most strongly correlated with sexual recidivism. It was noted that some of these factors did not change and these were labeled static factors, while other factors could be changed and these were labeled dynamic factors. Lessons learned included: 1) the strongest predictors of sexual recidivism were measures of sexual deviancy; 2) failure to complete treatment programs was a moderate predictor; and 3) offence denial, developmental history, low victim empathy, low IQ, and lack of motivation for treatment did not predict sexual recidivism in any meaningful way. The negative findings of this work were almost more important than the positive findings in that offence denial, and the others listed above, were commonly seen as indicating enhanced risk for sexual recidivism and were frequently used as a rationale to deny men access to treatment or consideration for release. These findings spurred the development of dynamic actuarial prediction instruments for men with histories of sexual offences.
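To make the idea of combining and weighing studies concrete, the following is a minimal sketch of one common approach: fixed-effect pooling of correlations through Fisher's z transformation, with each study weighted by n - 3. The study values are hypothetical and chosen only for illustration, and this is not necessarily the exact procedure used in the meta-analyses cited above.

```python
import math

def pool_correlations(studies):
    """Fixed-effect pooling of correlations via Fisher's z.

    `studies` is a list of (r, n) pairs. Each study is weighted by n - 3,
    the inverse of the sampling variance of Fisher's z.
    Returns the pooled r and an approximate 95% confidence interval.
    """
    weights = [n - 3 for _, n in studies]
    z_values = [math.atanh(r) for r, _ in studies]
    total_weight = sum(weights)
    z_pooled = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    standard_error = 1 / math.sqrt(total_weight)
    lower = z_pooled - 1.96 * standard_error
    upper = z_pooled + 1.96 * standard_error
    # Back-transform from the z metric to the correlation metric.
    return math.tanh(z_pooled), (math.tanh(lower), math.tanh(upper))

# Hypothetical (correlation, sample size) pairs, NOT taken from any cited study.
hypothetical_studies = [(0.10, 200), (0.25, 150), (0.18, 400)]
pooled_r, ci = pool_correlations(hypothetical_studies)
print(f"pooled r = {pooled_r:.3f}, 95% CI [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Larger studies pull the pooled estimate toward their own result, and a confidence interval that excludes zero indicates an effect beyond what random variation alone would be expected to produce.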
Dynamic Risk Factors
Arguably the most important advance in dynamic risk assessment was made for general criminal offenders – the Level of Supervision Inventory (LSI) developed by Don Andrews and colleagues (Andrews, 1982; Andrews & Robinson, 1984). This instrument contained variables that were truly “variable” – meaning that if observations are made and recorded, changes in the strength (occurrence) of these variables can not only be observed and quantified but can often be attributed to potentially knowable causes. At the same time, other research teams were working on other instruments that included dynamic variables: the Violence Risk Scale – Sexual Offense version (VRS-SO, Olver et al., 2007), the Sex Offender Need Assessment Rating (SONAR, Hanson & Harris, 1998, 2001), the Historical Clinical Risk-20 (HCR-20, Webster & Eaves, 1995), the Minnesota Sex Offender Screening Tool (MnSOST, Epperson et al., 1995), and the Sexual Violence Risk-20 (SVR-20, Boer et al., 1997).
Stable Risk Factors
Stable risk factors are personality characteristics, skill deficits, personal predilections, and learned behaviours that directly relate to risk for recidivism. Examples include poor cognitive problem-solving skills, becoming emotionally identified with children, and having poor interpersonal skills. These types of factors can be changed over time by focusing on improving individual capabilities, usually through efforts to change one's cognitions. Overall this involves learning strategies to fill these deficits, staying away from situations where you cannot manage your behaviour, thinking before acting, and making a concerted effort to practice your new skills over time. The research literature has examples of where an organized, evidence-based treatment program has been associated with lower recidivism (Cortoni & Nunes, 2007; Hanson et al., 2009; Hanson et al., 2002; Lösel & Schmucker, 2005; Schmucker & Lösel, 2015).
Acute Risk Factors
Acute risk factors are generally short-acting factors of unstable temporal duration that can change rapidly, often as the result of environmental or interpersonal conditions over which the offender has no control (Hanson & Harris, 2000b). Examples include the loss of community supports previously available to the released man, loss of employment or residence, and other stressors. Acute risk factors are not directly addressed in this paper but are of interest when outlining the development of dynamic risk factors. Please see Hanson and Harris (2001) and Harris and Hanson (2010) for further information on acute risk factors for sexual recidivism.
The Dynamic Predictors Project (DPP, Hanson & Harris, 1998) used the knowledge gained from the Hanson and Bussière (1996, 1998) meta-analytic studies to design a large multi-jurisdictional retrospective file review and interview study involving probation, police, and parole officers supervising men with a history of sexual offending in the community. A sample of 208 men who had recidivated sexually while on community supervision and 201 men who had not recidivated while on community supervision were matched on mental health diagnoses, offence history, and victim type at index offence. To develop the officer interviews and the file coding manual, several theoretical works and research results were considered: the works of Bandura (1977; social cognitive theory) and Laws (1989; relapse prevention), and studies demonstrating differences between sexual recidivists and non-recidivists (attitudes tolerant of sexual assault, Hanson et al., 1994; intimacy deficits, Seidman et al., 1994). Using these data, Hanson and Harris created the first dynamic risk need assessment rating that included both stable and acute items for men with a history of sexual offending.
The result was the Sex Offender Need Assessment Rating (SONAR, Hanson & Harris, 1998, 2001) containing five stable items and four acute items. The stable items were intimacy deficits, negative social influences, attitudes, sexual self-regulation, and general self-regulation, while the acute items were substance abuse, negative mood, anger/hostility, and victim access. All nine items were found to differentiate recidivists from non-recidivists even after controlling for static actuarial scores, age, and IQ (r = .43, Area Under the Curve, AUC = .74). As the SONAR had both stable and acute factors in the same instrument, after analysis we separated and expanded the stable and acute items, forming two new instruments: STABLE-2000 and ACUTE-2000.
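For readers less familiar with the AUC statistic reported here, it can be read as the probability that a randomly chosen recidivist has a higher score than a randomly chosen non-recidivist, with ties counted as one half. The sketch below computes it directly from that definition; the scores are hypothetical and are not drawn from the SONAR data.

```python
def auc(recidivist_scores, non_recidivist_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (recidivist, non-recidivist) pairs in which the
    recidivist has the higher score; tied pairs count as half a win.
    """
    wins = 0.0
    for r_score in recidivist_scores:
        for n_score in non_recidivist_scores:
            if r_score > n_score:
                wins += 1.0
            elif r_score == n_score:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(non_recidivist_scores))

# Hypothetical risk scores for illustration only.
recidivists = [9, 7, 6, 4]
non_recidivists = [5, 3, 3, 2, 6]
print(f"AUC = {auc(recidivists, non_recidivists):.3f}")
```

An AUC of .50 corresponds to chance-level discrimination and 1.00 to perfect separation of the two groups, which places values such as .74 in context.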
The Dynamic Predictors Project was retrospective, and some commentators felt that supervising officers might have biased their recollections so as not to look like they had missed something. Other commentators worried that interviewing only very experienced officers was “cherry picking”, and still others questioned whether it was possible to extract information about psychological characteristics from file data. To smite our adversaries, as they richly deserved, in 1999 Dr. Hanson and I conceived a prospective study called the Dynamic Supervision Project (DSP).
The Dynamic Supervision Project (DSP, Hanson et al., 2007) used a prospective design to assess dynamic risk factors in just over 1,000 males convicted of sexual offences for an average of 43 months as scored by parole, probation, and police officers. These officers were asked to re-score STABLE-2000 every six months. Analysis showed an AUC of .76 for sexual reoffence using scores from all officers. Interestingly, during the analyses (Hanson et al., 2007) a difference between officers who returned the requested forms in a timely manner and those officers who submitted incomplete or late information became apparent. Officers who sent the requested information on time were thought of as “conscientious”. Conscientious officers’ evaluations were more accurate (AUC in the .77 to .80 range). The lesson here is clear: if the scoring of these items is done with due attention, the prediction of sexual recidivism can be improved. STABLE-2000 demonstrated high levels of inter-rater reliability (Hanson et al., 2007), and 10 of its 16 items showed a linear relationship to sexual recidivism.
Subsequent analysis suggested the need to amend the item inventory of STABLE-2000. The three items assessing attitudes supportive of sexual assault were dropped and the scoring of three other items was refined. Additionally, the scoring system was simplified and the revised measure was called STABLE-2007 (Hanson et al., 2007). In both STABLE-2000 and STABLE-2007 the items significant social influences and cooperation with supervision are standalone items while the remainder gather under sub-headings of intimacy deficits (5 items), sexual self-regulation (3 items), general self-regulation (3 items), and attitudes supportive of sexual assault (STABLE-2000 only, 3 items) as seen in Table 1.
Table 1
Item Comparison of STABLE-2000 and STABLE-2007
STABLE-2000 (Items = 16) | STABLE-2007 (Items = 13) |
---|---|
Significant social influences | Significant social influences |
£ Lovers/Intimate partners | £ Lovers/Intimate partners |
£ Emotional identification with children | £ Emotional identification with children^a |
£ Hostility toward women | £ Hostility toward women |
£ General social rejection/Loneliness | £ General social rejection/Loneliness |
£ Lack of concern for others | £ Lack of concern for others |
» Sex drive/Preoccupation | » Sex drive/Preoccupation |
» Sex as coping | » Sex as coping |
» Deviant sexual interests | » Deviant sexual interests |
§ Sexual entitlement | |
§ Rape attitudes | |
§ Child molester attitudes | |
Cooperation with supervision | Cooperation with supervision |
Θ Impulsive behaviour | Θ Impulsive behaviour |
Θ Poor cognitive problem-solving skills | Θ Poor cognitive problem-solving skills |
Θ Negative emotionality/Hostility | Θ Negative emotionality/Hostility |
Note. £ = Intimacy deficits; » = Sexual self-regulation; § = Attitudes supportive of sexual assault; Θ = General self-regulation.
^a This item is scored only for men with a history of sexual offences against children.
STABLE-2007 Application in Indeterminate Detention Cases
One of the arguments against the use of STABLE-2007 with Sexually Violent Persons (SVP), Civil Commitment, Dangerous Offender (DO) designates (Canada), and other Indeterminate Detention (ID) cases is that STABLE-2007 was normed on a community sample; critics argue that men held under indeterminate sentences should be much higher risk. However, the available data from several different jurisdictions show that men at risk for or held for indeterminate periods fall well within the general limits of risk reliably assessed by Static-99 and well within the capabilities of STABLE-2007. Table 2 shows a sample-size-weighted average Static-99R score of 4.24 for men involved in indeterminate detention and similar cases.
Table 2
Average Static-99 or Static-99R Scores for SVP Evaluees in Five Studies
Author and Date | Jurisdiction | Sample type | N | Average Static-99(R) |
---|---|---|---|---|
DeClue et al. (2011) | Florida | Recommended for comprehensive SVP evaluation, Sexually Violent Predator Program | 178 | 4.7 (SD – NP); used both Static-99 and Static-99R |
| | Texas | Civilly committed SVP cases | 44 | 5.25 (SD – NP) |
Wilson et al. (2013) | Florida | Florida secure treatment facility – SVP and civil commitment | 120 | 4.85 (SD = 2.23) |
| | Canada^a | Federal, maximum secure psychiatric treatment centre | 377 | 4.05 (SD = 2.27) |
Eher et al. (2013) | Austria | Hospitalized mandatory treatment | 96 | 4.47 (SD – NP) (Static-99) |
Sandler and Freeman (2017) | New York^a | Civil management candidates who had sexually recidivated | 207 | 3.50 (SD = 1.99) |
Boccaccini et al. (2019) | Texas | Candidates for SVP commitment | 20 | 4.60 (SD = 1.43) |
Total N and weighted average | | | 1,042 | 4.24 |
Note. Not all those assessed were processed for Indeterminate Detention. NP = Not provided.
^a Used Static-99R.
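The weighted average in Table 2 can be reproduced from the sample sizes and mean scores in the table. The sketch below does so; the row labels are shorthand for the table rows (jurisdiction and sample size), and the variable names are illustrative only.

```python
# (row label, N, mean Static-99 or Static-99R score), transcribed from Table 2.
table_2_samples = [
    ("Florida, N = 178", 178, 4.70),
    ("Texas, N = 44", 44, 5.25),
    ("Florida, N = 120", 120, 4.85),
    ("Canada, N = 377", 377, 4.05),
    ("Austria, N = 96", 96, 4.47),
    ("New York, N = 207", 207, 3.50),
    ("Texas, N = 20", 20, 4.60),
]

# Each sample mean is weighted by its sample size.
total_n = sum(n for _, n, _ in table_2_samples)
weighted_mean = sum(n * mean for _, n, mean in table_2_samples) / total_n

print(f"Total N = {total_n}, weighted average = {weighted_mean:.2f}")
```

Because each mean is weighted by its sample size, the large Canadian sample (N = 377) has far more influence on the overall figure than the smallest Texas sample (N = 20).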
As documented in Dr. Olver’s paper (2021, this section), correctional jurisdictions typically assess, program, treat, and release men with similar Static-99R scores as a matter of routine. However, doubters remain. When presented with data such as that in Table 2 it is common to hear that these rankings could not apply as men under consideration for SVP, DO, or other forms of Indeterminate Detention usually have some sort of mental abnormality or mental illness concerns that would override the assessment of risk using actuarial measures.
Mental Illness
The above assertion raises the question: just how much evidence is there for the role of mental illness in sexual offender recidivism? I will address this point under four headings: where psychological diagnoses have been used, where psychological diagnoses have not been used, what meta-analysis has to say about this issue, and what the latest research says.
Where Psychological Diagnoses Have Been Used
In the 1990s some risk assessment methodologies included mental illness items, most prominently, the Risk Appraisal Guide (RAG; Harris et al., 1993; Webster et al., 1994). This assessment contained 12 items, of which two were DSM-III diagnoses (3rd ed.; DSM–III; American Psychiatric Association, 1980), “DSM-III Schizophrenia” and “DSM-III Personality Disorder”. The development sample for this instrument is directly applicable to SVP, ID, and DO samples as it came from the Oak Ridge Division of the Penetanguishene Mental Health Centre, the maximum secure forensic treatment and assessment centre for the Province of Ontario, Canada. This sample consisted of two groups: 332 men admitted and retained for treatment between 1965 and 1980, the majority of whom were resident in the institution and had been found Not Guilty by Reason of Insanity or Unfit to Stand Their Trial, and 286 men who were admitted temporarily for short assessment (30 to 90 days). In this sample a diagnosis of schizophrenia turned out to be a protective factor, resulting in the removal of three risk points. In contrast, not having a diagnosis of schizophrenia resulted in the addition of a risk point. The presence of a personality disorder resulted in the addition of three risk points while two risk points were removed if the man did not have a personality disorder diagnosis. In subsequent versions, these two items were dropped; however, a diagnosis of Conduct Disorder Prior to Age 15 was added (Harris et al., 2015).
One popular risk assessment instrument that does include a variable called “Major Mental Illness”, the Historical, Clinical, Risk Management-20 (HCR-20, Douglas et al., 2013, 2014), has not performed well in the field. Neal et al. (2015), in a study of 230 American maximum secure forensic hospital patients, found that the HCR-20 administered pre-release did not predict recidivism better than chance. Tully (2017) reviewed a small admission sample at a medium security forensic hospital in Flanders, Belgium, and found that at discharge assessment (n = 132) only ‘early maladjustment’ and ‘impulsivity’ were able to discriminate recidivists from non-recidivists. Coid et al. (2011) reviewed HCR-20 data on 1,353 male prisoners in England and found that only eight of the 20 test items were individually predictive and that neither ‘major mental disorder’ nor ‘personality disorder’ was among those eight.
The Sexual Violence Risk-20 (SVR-20, Boer et al., 1997; Hart & Boer, 2010; S. Hart, personal communication, June 21, 2020) includes the item “Major Mental Illness.” However, in 2010 and 2020 the authors noted that in this 20-item scale Major Mental Illness had the lowest validity, correlating -.06 with sexual violence, but to be fair, the base rate of mental illness in this population was low. Another instrument that includes the item “Major Mental Illness” is the Risk for Sexual Violence Protocol (RSVP, Hart et al., 2003). However, to date there have been no follow-up studies on this instrument and as a result we do not know if this instrument or any of its items predict sexual recidivism. Risk prediction instruments that have attempted to utilize psychological diagnoses have not found the presence of a mental disorder diagnosis to reliably predict recidivism.
Where Psychological Diagnoses Have Not Been Used
The following common risk assessment instruments for men who have sexual offending histories do not contain any mental health items: RRASOR (Hanson, 1997), VRS-SO (Olver et al., 2018; Wong et al., 2003-2017), SONAR (Hanson & Harris, 2000a), VASOR-2 (McGrath et al., 2014), SRA-FV (Thornton & Knight, 2015), SOTIPS (McGrath et al., 2012), RISK MATRIX-2000 (Thornton et al., 2003), Sexual Violence Risk-20 (SVR-20, Rettenberger et al., 2011), the Minnesota Sex Offender Screening Tool (MnSOST, Epperson et al., 1995), and the Minnesota Sex Offender Screening Tool – Revised (MnSOST-R, Epperson et al., 2003). The above information is intriguing due to what these researchers have not found. If mental illness diagnoses were reliably related to sexual recidivism, someone, somewhere, at some point, would have included a mental illness diagnosis in an actuarial assessment, using it to predict sexual recidivism; to date, this has not happened.
Meta-Analytic Findings and Mental Illness Diagnoses
A third source of information is the three meta-analyses on factors related to recidivism. As seen in Table 3, mental health diagnoses are not meaningful predictors of sexual recidivism. An argument could be made for anti-social personality disorder, severely disordered, and any personality disorder until it is remembered that these diagnoses tend to rely upon historically reported factors of observed behaviours indicating a pervasive pattern of violating the rights of others and an inability to maintain behaviour within societal norms. The small effect sizes seen in Table 3 would most likely be overwhelmed in any regression modeling by more powerful static indicators of social transgression such as criminal history variables. Of the 13 findings from these meta-analyses, leaving the Swedish Record Study aside due to excessive variability in the data, seven of the remaining 12 assessments have non-significant findings, and of the remaining five ‘Median ds’, all are small effects (using Cohen’s metric, 1988).
Table 3
Meta-Analytic Assessment of the Role of Mental Illness in Sexual Recidivism
Study | Diagnostic Category | Mdn d | 95% CI LL | 95% CI UL | n | k |
---|---|---|---|---|---|---|
Hanson & Bussière (1998) | Antisocial Personality Disorder | .35 | .07 | .21 | 811 | 6 |
| | Severely Disordered | .25 | .10 | .40 | 184 | 3 |
| | Any Personality Disorder | .26 | .05 | .27 | 315 | 3 |
| | General Psychological Problems^a | .00 | -.07 | .07 | 867 | 8 |
Hanson & Morton-Bourgon (2004) | General Psychological Functioning^a | .13 | -.05 | .30 | 1,403 | 8 |
| | Antisocial Personality Disorder | .29 | .11 | .31 | 3,267 | 12 |
| | Any Personality Disorder | .33 | .18 | .53 | 770 | 5 |
| | Anxiety^2 | .06 | -.18 | .31 | 667 | 6 |
| | Depression^2 | -.17 | -.34 | .08 | 850 | 7 |
| | Severe Psychological Dysfunction | .00 | -.19 | .12 | 1,268 | 8 |
Mann, Hanson, & Thornton (2010) | Major Mental Illness, Overall^a,b | .24 | -.03 | .38 | 2,783 | 9 |
| | Swedish Record Study^b | .90 | .66 | 1.14 | 1,125 | 1 |
| | Other Studies^a | -.03 | -.19 | .12 | 1,268 | 8 |
Note. d = effect size; k = number of studies. Values of d of .20 are considered small, .50 medium, and .80 and up large (Cohen, 1988). For ease of comparison, the r values reported in Hanson and Bussière (1998) have been converted to Cohen’s d values using Table 1 from Rice and Harris (2005). The value of d is approximately twice as large as the correlation (r) from the same data.
^a 95% confidence intervals that contain .00 are, by definition, not statistically significant (bolded).
^b The Q statistic represents variability across studies; where Q is significant there is reason to believe that variability is greater than that anticipated by chance and that the data may contain outliers. For the Swedish Record Study, the entry Major Mental Illness has a significant Q = 41.06, p < .01, leading us to believe that the Swedish Record Study contains outliers. Långström et al. (2004) found a large effect in the Swedish study but others have not been able to replicate this finding.
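The tally of findings in Table 3 can be checked mechanically. The sketch below applies two rules stated in the table notes: a 95% confidence interval that contains .00 is non-significant, and d values are labelled with Cohen's (1988) benchmarks. The row labels are abbreviations of the table entries, and the "negligible" label for values below .20 is an assumption added here for completeness.

```python
# (entry, median d, CI lower limit, CI upper limit), transcribed from Table 3.
table_3_rows = [
    ("1998 Antisocial Personality Disorder", 0.35, 0.07, 0.21),
    ("1998 Severely Disordered", 0.25, 0.10, 0.40),
    ("1998 Any Personality Disorder", 0.26, 0.05, 0.27),
    ("1998 General Psychological Problems", 0.00, -0.07, 0.07),
    ("2004 General Psychological Functioning", 0.13, -0.05, 0.30),
    ("2004 Antisocial Personality Disorder", 0.29, 0.11, 0.31),
    ("2004 Any Personality Disorder", 0.33, 0.18, 0.53),
    ("2004 Anxiety", 0.06, -0.18, 0.31),
    ("2004 Depression", -0.17, -0.34, 0.08),
    ("2004 Severe Psychological Dysfunction", 0.00, -0.19, 0.12),
    ("2010 Major Mental Illness, Overall", 0.24, -0.03, 0.38),
    ("2010 Swedish Record Study", 0.90, 0.66, 1.14),
    ("2010 Other Studies", -0.03, -0.19, 0.12),
]

def ci_contains_zero(lower, upper):
    """A 95% CI that spans zero is not statistically significant."""
    return lower <= 0.0 <= upper

def cohen_label(d):
    """Cohen's (1988) benchmarks; 'negligible' is an added label, not Cohen's."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"

# Set the Swedish Record Study aside, as the text does.
considered = [row for row in table_3_rows if "Swedish" not in row[0]]
non_significant = [row for row in considered if ci_contains_zero(row[2], row[3])]
significant = [row for row in considered if not ci_contains_zero(row[2], row[3])]

print(len(non_significant), "non-significant;", len(significant), "significant")
for entry, d, _, _ in significant:
    print(f"{entry}: d = {d:.2f} ({cohen_label(d)})")
```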
Mental Illness and Recidivism Risk
Aelick et al. (2020) report that in a sample of 409 men followed for an average of 11 years, severe mental illness diagnoses were not associated with sexual recidivism. However, histrionic and narcissistic personality disorders did predict sexual recidivism. Some other personality diagnoses were initially correlated with recidivism, but their variance was captured by Static-99R scores and substance abuse in the regression model. The Aelick et al. paper suggests further research is required before one can assume that mental illness is related to sexual recidivism.
In Summary
These four lines of evidence (instruments that used diagnoses, instruments that did not, meta-analytic data, and recent research) indicate that case conceptualizations that not only weigh actuarial scores but also assign additional risk weight to the presence of a mental disorder are likely to overestimate sexual reoffence risk.
Stable Dynamic Studies
Over time some studies have produced differing outcomes on the utility of STABLE-2007. While one study by Eher et al. (2012) found STABLE-2007 to predict all types of recidivism and found incremental predictive validity over Static-99, another assessing a high-intensity treatment program found STABLE-2007 scores did not predict any type of recidivism. When faced with contradictory findings we turn to meta-analysis.
A meta-analysis by Brankley et al. (2021) based on 12 unique samples (N = 6,955) found STABLE-2007 differentiated between sexual recidivists and non-recidivists. Each of the 13 items of STABLE-2007 predicted sexual recidivism, and its use improved predictive ability above Static-99R alone. Brankley et al. (2021, p. 11) report “fixed-effect weighted hazard ratio for sexual recidivism was 1.12 (95% CI [1.10, 1.14], k = 12), meaning that there was a 12% increase in the likelihood of sexual recidivism for every unit increase on STABLE-2007”. They also reported that STABLE-2007 scores added incremental predictive validity beyond that of Static-99R alone, with a fixed-effect weighted hazard ratio (HR) of 1.07 (95% CI [1.04, 1.09], k = 12), such that for every one-point increase in STABLE-2007 total score there was a 7% increase in the risk of sexual recidivism. Of great interest to the issue of whether STABLE-2007 should be employed with men in indeterminate and long-term detention is that this analysis found the predictive accuracy of STABLE-2007 was the same in routine and high-risk/high-need samples and that levels of prediction were consistent across studies. Overall, STABLE-2007 items predicted sexual recidivism, and when a moderator analysis was completed to check for allegiance effects, the country of origin of studies (Canada vs. other) was not statistically significant.
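A per-point hazard ratio compounds multiplicatively, which is easy to overlook when reading "a 12% increase for every unit increase". The sketch below shows the arithmetic, assuming the proportional hazards model holds across the score range of interest; it illustrates the statistic only and is not a tool for estimating any individual's risk. The function name is illustrative.

```python
def relative_hazard(hr_per_point, point_difference):
    """Relative hazard implied by a score difference under a Cox model.

    Assumes the hazard ratio applies equally to every one-point step,
    so the effect of a k-point difference is hr_per_point ** k.
    """
    return hr_per_point ** point_difference

# Hazard ratios reported by Brankley et al. (2021), as quoted above.
HR_STABLE_ALONE = 1.12
HR_STABLE_BEYOND_STATIC99R = 1.07

for difference in (1, 5, 10):
    alone = relative_hazard(HR_STABLE_ALONE, difference)
    incremental = relative_hazard(HR_STABLE_BEYOND_STATIC99R, difference)
    print(f"{difference:>2}-point difference: "
          f"{alone:.2f} (alone), {incremental:.2f} (beyond Static-99R)")
```

Under this assumption a five-point difference in STABLE-2007 total score corresponds to roughly a 76% higher hazard (1.12 to the fifth power), not the 60% that simple addition would suggest.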
These findings are strengthened by a prior, more broadly based, meta-analytic study (van den Berg et al., 2018) testing predictive outcomes from 14 different dynamic instruments (N = 13,446) across 41 samples, which found a fixed-effect weighted Cohen’s d of 0.71 based upon 22 unique samples and a fixed-effect weighted hazard ratio of 1.08 based upon 13 unique samples. Of note is that the Brankley et al. (2021) meta-analysis used 10 of the same samples as were used in the van den Berg et al. (2018) paper.
Predictive and Incremental Validity
Etzler et al. (2020) added to their 2012 data set to study the predictive and incremental validity of STABLE-2007 beyond Static-99 with a sample of N = 638 adult males who had been convicted of a sexual offence and followed for an average of 8.2 years. This dataset was included in the Brankley et al. (2021) meta-analysis. These authors concluded STABLE-2007 total scores added information beyond that available from Static-99 scores alone. This finding is supported by van den Berg et al. (2018) who found incremental predictive validity of dynamic instruments above static instruments with a Cox hazard ratio of 1.08 for sexual recidivism based upon 13 unique samples. The Etzler paper is highlighted here due to three particularly helpful additional analyses: a factor analysis, a table of STABLE-2007 inter-item correlations, and a calibration analysis comparing predicted recidivism rates to observed recidivism.
The factor analysis revealed three factors, 1) Antisociality, 2) Sexual Deviance, and 3) Hypersexuality; the subsequent factor rotation showed the three factors to be almost uncorrelated. The table of item correlations shows intercorrelations to be small “suggesting different item content and little common variance” (p. 13). These two analyses suggest STABLE-2007 broadly samples from multiple recidivism-relevant constructs. The calibration analysis showed that in four of the five nominal risk categories (well below average risk, below average risk, average risk, above average risk, and well above average risk; Hanson, Bourgon, et al., 2017), STABLE-2007 overestimated the recidivism risk compared to levels of observed recidivism but that none of the differences were statistically significant. These authors concluded Static-STABLE nominal categories “could be regarded as adequately calibrated” (p. 12) and that the findings support “the clinical utility of the application of STABLE-2007 nominal risk categories for predicting sexual reoffences” (p. 13).
A Network Analysis
Recently, an innovative statistical study was completed by van den Berg and colleagues (2020) assessing the inter-relationships and dependencies within STABLE-2007. Using the data set from the Dynamic Supervision Project (DSP; Hanson et al., 2007), this team applied network analysis to reveal underlying relationships between STABLE-2007 items. Network analysis allows the assessment of complex connections between many items (McNally, 2016), hopefully leading to an improved understanding of the causal pathways and processes involved in the relationship between recidivism and sexual offending. These authors found the strongest relationship between impulsive acts and sexual recidivism. General rejection and loneliness, poor cognitive problem-solving skills, and impulsive acts were found to be in a “central position” in these analyses, suggesting that intervention and/or treatment on these issues may well affect the saliency of other risk factors. The authors conclude treatment should “focus on impulsive acts, improving cognitive problem solving, reducing feelings of loneliness, and stimulating reintegration” (p. 187).
This section on meta-analysis and recent statistical research demonstrates the continued interest in, and evaluation of, these dynamic instruments signifying that these instruments are evolving with the populations they assess and that continuing research will further sharpen them.
Common Language for Risk Assessment
One of the advances in risk assessment is the development of a common scientific language for the representation of risk levels in all offenders. In the past, such arbitrary nominal categorizations as “Low Risk”, “Moderate Risk”, and “High Risk” had different meanings to different researchers, clinicians, politicians, and especially, the public. To improve this non-system, in 2014 the Council of State Governments Justice Center convened a meeting to develop a common language of risk that could be applied to all actuarial risk measures, developing a standard of practice. This system bases risk categories on recidivism data, is easy to implement for all scales/instruments, and allows for more precision in the communication of risk information (see Hanson, Bourgon, et al., 2017).
Briefly, this five-level system was adjusted for men with a history of sexual offences. Based upon recidivism data, Level I, the ‘very low risk’ level represents about the lowest 7% of the recidivism distribution. Level II, the ‘below average risk’ level represents approximately the next higher 18% of the recidivism distribution. Level III, the ‘average’ recidivism risk represents approximately the middle 50% of the recidivism distribution. Above that is Level IVa, the ‘above average’ level of recidivism risk and occupying another approximately 18% of the recidivism distribution, and above that again, Level IVb, the ‘well above average’ recidivism risk group taking up approximately the top 7% of the recidivism distribution. These levels of risk and logistic regression estimates of sexual recidivism risk for five- and ten-year follow-up can be seen in Appendix A1 and A2 of Lee and Hanson (2021).
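The five bands described above can be expressed as a simple lookup from a percentile position in the recidivism distribution to a level. The sketch below uses the approximate band widths given in the text (7%, 18%, 50%, 18%, 7%); the exact boundary handling is an assumption made for illustration, and the function name is hypothetical.

```python
# (level, approximate share of the recidivism distribution in percent).
LEVEL_BANDS = [("I", 7), ("II", 18), ("III", 50), ("IVa", 18), ("IVb", 7)]

def level_for_percentile(percentile):
    """Map a percentile (0 to 100) of the risk distribution to a level.

    Bands are filled from the lowest risk upward; a percentile exactly on
    a boundary is assigned to the higher band (an illustrative choice).
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    upper_bound = 0
    for level, width in LEVEL_BANDS:
        upper_bound += width
        if percentile < upper_bound:
            return level
    return LEVEL_BANDS[-1][0]  # percentile == 100 falls in the top band

for p in (3, 20, 50, 80, 95):
    print(f"percentile {p}: Level {level_for_percentile(p)}")
```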
In addition, STABLE-2007 scores can be combined with Static-99R scores to produce an overall combined risk estimate, still using the levels system, that is helpful for treatment and supervision staff to define and rank men based upon static and stable risk. Looking at Table 4, a man who scores a “2” (two points, the median score) on Static-99R and scores between three (3) and eleven (11) points on STABLE-2007 has an overall combined risk estimate of ‘Average’ (Level III). However, if he scores twelve (12) or above on STABLE-2007 he moves up a risk level to ‘Above Average Risk’ (Level IVa). If he only scores a 0, 1, or 2 on STABLE-2007 then his overall combined risk estimate would move down to ‘Below Average Risk’ (Level II); see Brankley et al. (2017), Appendix C. Recidivism estimates for each of these risk levels are shown in Table 5 of Brankley et al. (2017). It should be clear that these risk levels not only indicate a man’s objective risk estimate but also represent a metric for how to assign treatment and supervision resources in the community following the risk principle (Andrews & Bonta, 2010, 2015).
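The worked example in this paragraph can be written as a small decision rule. The sketch below covers only the single case described in the text, a Static-99R score of 2; the cut-offs for other Static-99R scores are in Brankley et al. (2017), Appendix C, and are not reproduced here. The function name is hypothetical.

```python
def combined_level_for_static99r_of_2(stable_2007_total):
    """Combined risk level when the Static-99R score is 2.

    Encodes only the example in the text: STABLE-2007 totals of 0 to 2
    give Level II, 3 to 11 give Level III, and 12 or above give Level IVa.
    """
    if stable_2007_total < 0:
        raise ValueError("STABLE-2007 total cannot be negative")
    if stable_2007_total <= 2:
        return "II (Below Average Risk)"
    if stable_2007_total <= 11:
        return "III (Average)"
    return "IVa (Above Average Risk)"

for total in (1, 7, 12):
    print(f"STABLE-2007 = {total}: Level "
          f"{combined_level_for_static99r_of_2(total)}")
```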
Institutionalized STABLE-2007 Risk Assessment
Scoring STABLE-2007 within the institution is not as difficult as some believe. However, there are two immediate caveats: first, it does take some interpretation of the guidelines from the manual (Fernandez et al., 2014); second, scoring is easier for a staff member at the institution who has relatively easy access to other staff and to institutional records. The short examples below are merely common situations and are not representative of all possible scoring situations; there are many more techniques for assessing each of the STABLE-2007 items than the few obvious ones listed here.
Within the institution, Significant Social Influences can often be easy to score, as most institutions have visitor logs, call logs, and may have surface mail logs. For example, if a man tells you in his interview that he has constant support from his uncle, yet there is no record of his uncle entering the institution and there have been no letters or phone calls, you are correct in suspecting that the uncle’s support may not be as constant as the man contends. On Capacity for Relationship Stability, it is almost impossible for a man to score “no problem” (a ‘0’). Here you assess whether he has an established pre-sentence intimate partner, of any gender, who provides substantial long-term support that continues while he is incarcerated; this pattern scores a ‘1’. Hostility to Women can often be assessed within the institution by his differential reaction to male and female staff, especially in terms of derogatory or inappropriate comments about females, complaints against female officers or staff, and ignoring orders from female officers while complying with orders issued by male officers. Lack of Concern for Others can be assessed on the unit if the man is particularly ruthless, cold, or unfeeling, takes advantage of weaker or inadequate men, or steals on the unit. Impulsive Acts include slamming out of meetings or sessions, especially when the issue has not yet been settled, being quick to join anti-management issues without obvious thought, and getting caught holding contraband for someone else. Negative Emotionality is scored by considering issues such as whether he feels everyone is out to get him, tendencies toward paranoia, and holding a grudge against the world. Sex Drive and Pre-occupations may be assessed by casual or impersonal sexual activity on the unit, extensive sex talk on the unit, or, conversely, his insisting that he has no sexual thoughts or strenuously insisting that sexual thoughts and/or behavior are “evil” or “sinful”.
Deviant Sexual Interests can be recognized and scored from a behavioral history of deviant victim preference when there is no reliable self-report or phallometric assessment.
Table 4
Combining STABLE-2007 Scores With Static-99R Scores
Note. From Brankley, A. E., Helmus, L. M., & Hanson, R. K. (2017). STABLE-2007 Evaluator Workbook: Revised 2017. Available at: http://www.STATIC99.org. Reprinted with permission.
Finally, a lack of Cooperation with Supervision can be recognized in the man who does not engage with his correctional plan, sees no need for treatment or groups, or insists that he cannot talk about his crimes, either at the behest of his lawyer even though he was convicted years ago or because it is between him and a supreme being now. Conversations with institutional parole officers, charge correctional officers, workshop instructors, and other staff within the institution may also be enlightening.
Successful scoring of STABLE-2007 has been shown to be viable in several high security settings, as seen in the studies by Eher et al. (2012, 2013); Fernandez (2021, this section); Looman and Abracen (2012); Looman and Goldstein (2015); Sowden (2013); and Sowden and Olver (2017). These studies show that STABLE-2007 scores are significantly and incrementally related to sexual recidivism.
Nine Reasons to Utilize Dynamic Assessment in Indeterminate Detention Cases
(1) Organizational Effectiveness
Having a defensible, organized, and managerially supported assessment methodology provides common templates of risk assessment within the organization, improving communication and understanding and supporting organizational goals and the mission statement. STABLE-2007 is the most widely used measure of dynamic risk for men with a history of sexual offending in Canada and the United States (Bourgon et al., 2018; Helmus, Hanson, et al., 2012; Kelley et al., 2020; McGrath et al., 2010) and brings a common language of risk communication (Hanson et al., 2017) to jurisdictions that use these instruments.
(2) Systemic Operational Efficiency
The best way to prioritize correctional resources is to follow Andrews’ Risk Principle so that, for the lowest level of taxpayer investment, society receives the highest level of public safety (Andrews & Bonta, 2010, 2015; see Risk, Need, Responsivity, and Therapeutic Engagement principles). Some men under indeterminate detention will eventually be released to the community. Public safety is enhanced when dynamic assessment allows individual risk assessors to reliably rank men in terms of risk for sexual reoffence. From this solid foundation, the organization can decide how best to deploy its resources, mainly staff time and treatment placements, following the Risk Principle. It is grossly inefficient, from both a risk management perspective and a resource utilization perspective, to spend the same amount of time and resources on every offender regardless of the risk he poses. Dynamic risk assessment provides treatment providers and institutional supervisors with instruments to rationalize and manage treatment resources.
(3) Repeated Assessments
An important feature of dynamic risk assessment is that with repeated assessments at spaced time intervals it is possible to track changes – for better or worse – in the man’s risk level over time. Unlike static measures, dynamic measures are expected to change and do change. There is reason to believe that positive changes may be accelerated through treatment and that for measurable reductions in dynamic scores there will be a concomitant reduction in the risk of recidivism (van den Berg et al., 2018).
(4) Dynamic Assessment Time Frames
The STABLE-2007 assessment uses all available information but places most weight on the last year, looking forward to what the man will most likely be like in the coming year. In the usual course of community supervision, stable factors should be reviewed once a year, creating an implicit timeframe for stable assessment of “a year either side of today”. The current recommendation is that STABLE-2007 be reviewed once a year for each offender, except for long-term incarcerated men who are making no efforts and attending no programs or interventions; in these cases, a review every two years should be sufficient.
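For institutions that track review dates electronically, the recommended interval can be expressed as a simple scheduling rule. The sketch below is illustrative only: the function name, its parameters, and the handling of a 29 February review date are assumptions, not part of the STABLE-2007 guidance.

```python
from datetime import date


def next_review_date(
    last_review: date,
    long_term_incarcerated: bool,
    making_efforts_or_attending_programs: bool,
) -> date:
    """Return the date by which STABLE-2007 should next be reviewed.

    Default interval: one year. For a long-term incarcerated man who is
    making no efforts and attending no programs or interventions, the
    interval is extended to two years.
    """
    if long_term_incarcerated and not making_efforts_or_attending_programs:
        interval_years = 2
    else:
        interval_years = 1
    target_year = last_review.year + interval_years
    try:
        return last_review.replace(year=target_year)
    except ValueError:
        # A 29 February review date has no counterpart in a non-leap
        # year; fall back to 28 February (an assumption of this sketch).
        return last_review.replace(year=target_year, day=28)
```

A man last assessed on 1 March 2023 who is attending programs would be due on 1 March 2024; a long-term incarcerated man who is making no efforts and attending no programs would be due on 1 March 2025.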
(5) Treatment Efficiency
Treatment assignment is made easier using the thirteen STABLE-2007 items, each of which represents one of the best empirically validated sex offender treatment targets for that individual (Hanson et al., 2007). As a result, institutional personnel are able to select treatment targets more effectively and assign the man only to those resources that he requires. Further, studies from the United Kingdom and Ireland found that STABLE-2007 expanded the range of issues addressed in case reports and improved confidence in decision making (McNaughton-Nicholls et al., 2010; Walker & O’Rourke, 2013). Briken and Müller (2014) reported that German treatment providers found STABLE-2007 items useful in assessing the severity of paraphilias.
(6) Professional Knowledge Dissemination and Improving Officer Expertise
Knowledge dissemination within an organization and the mentoring of new staff is of concern to all institutions and jurisdictions. The combination of static assessment, stable assessment, and using the levels system (Hanson, Bourgon, et al., 2017) provides the novice officer with a framework from which to organize their overall assessment of the individual’s risk, treatment, and supervisory needs. It provides the experienced officer with the documented justification and research for complicated judgements in complex cases and provides the institution with a standard reference such that risk communication throughout the institution and jurisdiction is consistent and treatment recommendations understood by all.
(7) Officer Support and Resources
STABLE-2007 was designed for scoring by front-line officers and provides a comprehensive suite of resources such as coding manuals, tally sheets, exercises, evaluator workbooks, and free instructional videos on YouTube.com. In addition, e-mail support is available from the Static Administrator at staticquestions@gmail.com. Materials are provided at no cost, but training to administer these instruments is required and costs may be associated with obtaining that training. The organization benefits from having assessments scored by the personnel who have the most interaction with the man, the most access to his situation, and the most up-to-date knowledge of his current level of functioning. Research has shown that, with training and managerial support, basic sex offender risk assessments can be completed by front-line staff with as much accuracy as basic assessments completed by more expensive clinical staff (Hanson et al., 2007).
(8) Transparency and Objectivity
Use of a reputable dynamic risk assessment system greatly improves objectivity and transparency. Staff and the man can tell what decisions and recommendations are being made and why. Decision processes are based on empirical evidence and objectively scored, providing the man with information that is rational and explainable. This improves trust and leads to increased engagement. It also allows the man and the supervisor to come to agreement on concrete behavioural targets, goals, and expectations.
(9) Development of Local Norms
The reluctance to use norms developed on community probation and parole samples with indeterminate detention cases is understandable, and there is no pretense that this is an optimal solution. Available datasets of men released to the community from indeterminate detention are few. Hanson and colleagues have always suggested the development of local norms to overcome any sample bias created by legal, sentencing, or release considerations. For example, Texas, USA, has completed local norms for Static-99R (Boccaccini et al., 2017, 2019), and all jurisdictions are encouraged to research and document local indeterminate detention norms from those men who are released. A large state, province, or even country could develop local norms at very little cost by engaging with universities and utilizing graduate student labour. Additionally, while there are admittedly problems when dealing with SVP, DO, and ID populations, the development of local norms would ameliorate the historical default of just labelling all cases as high risk.
Further Research
A multitude of research projects could be carried out on these instruments to the benefit of men in indeterminate detention. I will suggest the four most important. Due to the limited number of men released each year from any given jurisdiction’s ID programs, the development of a multi-jurisdictional research program that provides sufficient data for an effective long-term recidivism follow-up is critical to longer-term risk management of this population. This would be difficult, as it would require an addition to the standard research group: some sort of political veteran who could convince differing jurisdictions to join the project.
Within jurisdictions that release enough men to create local norms, developing such norms would be valuable, as it would account for sampling differences between jurisdictions caused by political, legal, or local practice differences. Again, within any given jurisdiction, it would be helpful to follow changes in dynamic scores over repeated assessments while the man is in custody and to examine a number of outcomes. Do dynamic scores provide additional predictive validity above and beyond static assessment? Do men who participate in treatment programs based upon stable dynamic factors do better than men who pick and choose treatments based upon their own understanding of their treatment needs, or men who do not engage in treatment at all? And do men who engage in organized and directed treatment while still under ID status get out earlier, and do they fare better once in the community? Finally, following the example set by Aelick et al. (2020), the role of a diagnosis of a mental disorder in sexual recidivism requires independent replication in multiple jurisdictions, preferably with significant international representation.
Conclusions
STABLE-2007 items predict sexual recidivism, and STABLE-2007 scores provide incremental validity to Static-99R scores. STABLE-2007 items are important indicators of criminogenic need and, as a result, provide guidance in target selection for both community and institutional supervision and interventions/treatment. Rettenberger and Craig (2020) state that dynamic risk factors “have to be regarded today as a cornerstone of effective treatment program implementation” (p. 93). STABLE-2007 assessments done at regular intervals also give an indication of improvement or deterioration in the offender’s functioning and hence suggest changes to his current management regime and his overall sexual recidivism risk. Having treatment and intervention plans assessed early in the stay allows for greatly increased organizational effectiveness, as these assessments provide a common language of risk within the organization, increase the range of issues addressed in reports, and lead to improved confidence in decision making among front-line staff. Operationally, these improved treatment plans can be reviewed for a whole cohort, and treatment groups and activities can be planned based upon the known needs of the men currently incarcerated. Long-stay institutions should adopt a validated measure of dynamic risk and score it repeatedly over appropriate time intervals to guide treatment and supervision and to assess change in recidivism risk over time.