Factor Structure of the J-SOAP-II Among Black and White Male Youth: A Confirmatory and Exploratory Factor Analysis

The Juvenile Sex Offender Assessment Protocol-II (J-SOAP-II) is a tool used to aid clinicians in assessing the sexual and criminal reoffense risk of male youths who have committed a sex offense. Despite its popularity, the factor structure has not been thoroughly assessed. The present study used confirmatory factor analysis (CFA) to test the factor structure of the four subscales of the J-SOAP-II in a group of youths aged 12-18 who were confined for sexual offenses (N = 909), and to test whether the fit is affected by youth race. The results showed a poor fit to the data. A post hoc goal was added: to propose a new factor structure using exploratory factor analysis (EFA) on one half of the data and CFA on the second half. The EFA identified three factors: Sexual Offending and Victimization History, Risk for General Delinquency, and Antisocial Beliefs and Attitudes. This three-factor model provided an improved, but not good, fit, indicating that further modifications to the J-SOAP-II are required to meaningfully capture risk-relevant latent constructs.

al., 2018). Several researchers found that the total score is a moderate predictor of sexual recidivism, with AUCs ranging from .63 to .83 (Aebi et al., 2011; Barra et al., 2018; Martinez et al., 2007; Prentky et al., 2010; Rajlic & Gretton, 2010). Others found that the total score does not predict sexual recidivism (Caldwell et al., 2008; Fanniff & Letourneau, 2012; Parks & Bard, 2006; Viljoen et al., 2008). Perhaps not surprisingly, the findings on predictive validity associated with each subscale are similarly mixed (Chu et al., 2012; Parks & Bard, 2006; Prentky et al., 2010; Viljoen et al., 2008). For example, Chu et al. (2012) observed that only the first subscale was predictive of sexual recidivism (AUC = .72), while the AUCs for the other three subscales ranged from .37 to .55. Conversely, Wijetunga and colleagues (2018) found that only the first subscale was not predictive of sexual recidivism (AUC = .27), while the AUC values for the other three subscales ranged from .72 to .88 (Chu et al., 2012; Parks & Bard, 2006; Prentky et al., 2010; Viljoen et al., 2008). Expanding on previous work focused on recidivism during adolescence, a recent study evaluated the long-term predictive validity of the J-SOAP-II into adulthood. Results indicated that Sexual Drive/Preoccupation was associated with sexual recidivism, while the other three subscales were associated with nonsexual violent recidivism (Schwartz-Mette et al., 2019). To date, there is a paucity of research that examines whether and how a youth's race impacts the psychometric properties of the J-SOAP-II.
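To make the AUC values cited above concrete, the following is a minimal Python sketch of how an AUC can be computed from risk scores. It is not taken from any of the cited studies; the function name `auc` and the scores are hypothetical and invented purely for illustration. The AUC is treated here as the probability that a randomly chosen youth who reoffended has a higher score than a randomly chosen youth who did not, with ties counted as one half.

```python
def auc(scores_recidivists, scores_nonrecidivists):
    """Return the probability that a randomly chosen recidivist scores
    higher than a randomly chosen non-recidivist (ties count as 0.5)."""
    if not scores_recidivists or not scores_nonrecidivists:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for r in scores_recidivists:
        for n in scores_nonrecidivists:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(scores_recidivists) * len(scores_nonrecidivists))


if __name__ == "__main__":
    # Hypothetical total scores, invented for illustration only.
    recidivists = [14, 9, 11]
    non_recidivists = [8, 12, 5, 9]
    print(round(auc(recidivists, non_recidivists), 3))
```

Under this definition, an AUC of .50 corresponds to chance-level discrimination, and an AUC below .50 (such as the .27 reported above) means that higher scores were more common among youths who did not reoffend.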
Identifying latent constructs for static actuarial adult sex offender recidivism risk scales has been ongoing for over a decade (Allen & Pflugradt, 2014; Barbaree et al., 2006; Brouillette-Alarie et al., 2016; Roberts et al., 2002; Rohrer, 2019). At least two relevant risk factors are usually found: antisocial behavior/general criminality, and sexual deviance. More recently, there has been an increased focus on latent constructs in dynamic risk assessment instruments for adults with a sexual offense. Etzler, Eher, and Rettenberger (2018) found three latent constructs in the Stable-2007 (i.e., Antisociality, Sexual Deviance, Hypersexuality); Antisociality and Sexual Deviance significantly predicted recidivism. Olver and Eher (2019) used confirmatory factor analysis and noted that three latent constructs comprising the Violence Risk Scale-Sexual Offence version (VRS-SO) fit the data well: Sexual Deviance, Criminality, and Treatment Responsivity.
Comparable evaluations were recently conducted in youth risk assessment tools. Martin (2017) found four latent constructs present in the JSORRAT-II: Sexual Offending History, General Criminality, Familial Abuse, and Educational Disruptions. Rojas and Olver (2019) used EFA and observed three latent dimensions in the Violence Risk Scale-Youth Sexual Offense version (VRS-YSO), namely: Sexual Deviance, Antisocial Tendencies, and Family Concerns. Altogether, findings from these studies indicate that the Antisociality and Sexual Deviance findings from the adult literature translate to youth risk, but that additional factors also contribute to risk for sexual offending among youth. In particular, the youth's family appears to impact youth sexual offending and sexual reoffending.
Risk assessment among youths is fraught with limitations, often stemming from the youth's developmental stage. The dual systems model of adolescent risk-taking posits the primary reason adolescents are at higher risk for engaging in offending behaviors is a lag between two neural systems: the socioemotional system (which increases reward-seeking behavior) and the cognitive control system (which increases self-regulation and impulse control) (Steinberg, 2010). The theory recognizes that adolescents have increased dopaminergic activity within their socioemotional systems and delayed maturation of their cognitive control systems, which results in adolescents being biased toward reward seeking and having a developmental delay in their ability to inhibit risk-taking (Steinberg, 2010). As the youth ages and their cognitive control system matures, their participation in risky behaviors, and often any offending behaviors, subsides. Altogether, the dual systems model suggests a youth's reoffense risk is likely to change rapidly over time, namely increasing during adolescence and decreasing at the end of adolescence/young adulthood, even without outside intervention.
Heterogeneous and differential developmental maturation of a youth's cognitive control system essentially translates to infeasibility of the use of strict risk categories (Kang et al., 2019). It also means that many youths who engage in offending behavior may be doing so in part because they are unable to properly inhibit their behavior.
Development of the cognitive control system likely impacts youth's sexual behavior similarly to other risky behavior. Moreover, youth who engage in offending behaviors are an extremely heterogeneous group; accordingly, not only are the presenting reasons behind their inappropriate sexual behavior (i.e., their risk for engaging in such behavior) varied, but their treatment and rehabilitation needs are also highly variable. Assigning a youth to unnecessarily intensive treatment wastes time and resources and may even increase their risk of reoffense (Lowenkamp & Latessa, 2004). Additionally, the base rate of adolescent sexual reoffending is very low (Caldwell, 2016), and thus any attempt at predicting a reoffense will be rife with false positives, capping risk assessment validity. In light of the developmental limitations that can impact risk, focusing on predicting sexual reoffending may not be a valuable use of time; rather than using risk assessment scales to predict risk, Kang and colleagues (2019) recommend focusing on remediation, which requires a strong understanding of the individual treatment needs of the adolescent.
One model to guide the allocation of treatment is the risk-need-responsivity (RNR) model (Andrews & Bonta, 2010). In this model, risk refers to matching the level of treatment to the individual's own reoffense risk factors. Need focuses on the specific dynamic risk factors that the individual endorses. Responsivity tailors the treatment to the learning styles and strengths of the individual. Treatments that adhere to the RNR principles demonstrate a greater reduction in recidivism than treatments that do not. The J-SOAP-II can be used to guide an RNR-based treatment. Higher scores on the J-SOAP-II can be used to convey risk, while endorsement of the specific items can be used to guide treatment based on need. One drawback of this approach is that the static items are unable to be used to guide the need principle.
The division of static and dynamic factors in risk assessments associated with sexual offending makes intuitive sense, as static items may contribute to risk but are unable to be modified, while dynamic items can provide a clear treatment target. Factor analysis on static items is an attractive option, as static items are unable to be modified, but may be tapping into a latent construct that is responsive to treatment. However, there is no reason to suspect that static and dynamic items represent different latent constructs, so identification of risk-relevant constructs may combine both static and dynamic items.
Of additional concern in risk assessment research for youth with sexual offenses, race is underexamined in the literature. Although studies of the J-SOAP-II often include racially diverse samples, they frequently do not capitalize on this strength, and fail to examine the impact of race on the psychometric properties of the J-SOAP-II. For example, several studies with racially diverse samples of youth did not test the predictive validity of J-SOAP-II scores on reoffense risk (Barroso et al., 2019; Caldwell & Dickinson, 2009; Martinez et al., 2007). Recent research indicated that Black youths score higher than White youths on the J-SOAP-II, though it remains unclear whether these score differences represent a true difference in risk (Fix et al., 2017). In clinical and other real-world settings, such differences can profoundly impact outcomes for youth, for example by playing a role in determining whether a youth is confined or placed in outpatient treatment, or by influencing judicial sentencing. Accordingly, research is needed to identify whether and how psychometric properties of the J-SOAP-II are commensurate across racial groups.
Additionally, the relationship between the home environment and compliance is more robust among Black youths than among White youths in a sample of general adjudicated youths (Caldwell et al., 2007). Considering that the J-SOAP-II contains multiple items that examine the home environment, it is possible that this difference would impact the performance of the J-SOAP-II. However, no research to date has tested this phenomenon.
Other psychometric research on the J-SOAP-II remains incomplete. Prior research has established convergent validity by correlating the J-SOAP-II with other juvenile risk assessment measures (e.g., JSORRAT), or measures looking at similar but conceptually differentiated constructs (e.g., PCL-YV). Results from related research consistently demonstrate moderate and statistically significant positive correlations with other risk assessments, or constructs (see Barroso et al., 2019). Further, it remains unclear how to best use the J-SOAP-II as a marker of risk for sexual recidivism. Indeed, Prentky and Righthand (2003) both noted they had not identified cut-off scores denoting risk categories for the J-SOAP-II, and also stressed that the J-SOAP-II should be used in conjunction with other risk assessments.

Current Study
Despite the multitude of studies on the utility of the J-SOAP-II, we are no closer to establishing meaningful cut-off scores, or the ability to use the J-SOAP-II as an independent assessment. One fruitful area of research would be to focus on clear targets for revision to the J-SOAP-II. Additionally, despite research observing racial group differences on the J-SOAP-II and other risk assessments, little is known about whether and how scores on the J-SOAP-II are impacted by youth race. In response, the primary goal of the current study was to examine whether the J-SOAP-II is tapping into a risk-relevant latent construct, and the secondary goal was to determine whether the latent construct was comparable for Black and White youths.
As the factor structure of the J-SOAP-II was determined using principal components analysis, we first examined the model fit in a group of youths aged 12-18 who were confined for sexual offenses by using confirmatory factor analysis (CFA) to determine if the J-SOAP-II was already tapping into risk-relevant latent constructs. To accomplish our secondary aim of testing if the model fit was affected by youth race, additional CFA models were run to compare the model fit of Black youths to White youths. Lastly, due to poor results following analyses aimed at the first two goals, a post-hoc goal was added: to seek out a more optimal factor structure through the use of exploratory factor analysis (EFA).

Method

Participants
Study data came from 909 male youths, aged 12-18, confined in a juvenile correctional center within a Southeastern state for a convicted sex offense. Of these youth, 411 were Black, 462 were White, and 36 were of another race/ethnicity. Similar to procedures described in Fix et al. (2017), all participants were required to provide assent and to have a legal guardian provide consent to have their data included in the present study. Following assent and consent, a psychological evaluation was provided to each juvenile in order to individually tailor their court-ordered treatment. Evaluations were carried out independently by trained graduate clinicians, approximately 2 weeks after the juvenile's arrival at the detention facility; interrater reliability was not calculated. The evaluation included administration of several standardized measures, including the J-SOAP-II. Participants were made fully aware that although the evaluation process was necessary to inform their treatment, use of their data for research purposes was entirely voluntary.

Measures

Juvenile Sex Offender Assessment Protocol-II (J-SOAP-II)
The J-SOAP-II is a 28-item checklist designed to be administered by a trained clinician in order to assess risk factors related to sexual and criminal offending in adolescents (Prentky & Righthand, 2003). The J-SOAP-II has a Total J-SOAP-II Score and four subscales: Sexual Drive/Preoccupation, Impulsive/Antisocial Behavior, Intervention, and Community Stability/Adjustment. Internal consistency on these subscales has been observed to be between .56 and .91 (Fanniff & Letourneau, 2012). In addition, the J-SOAP-II has demonstrated concurrent validity with comparable measures and sexual offense information (Prentky & Righthand, 2003). J-SOAP-II scores were obtained prior to the onset of treatment in the current study.

Statistical Analysis
There were three steps to our analyses. First, confirmatory factor analysis (CFA) was used to test whether the proposed factor structure for the J-SOAP-II appropriately fits the data. Second, the dataset was divided into White and Black youths, and the factor structure was assessed separately by race. Third, the full dataset was divided in half randomly; exploratory factor analysis (EFA) was used on the first half of the dataset to identify a new factor structure, and CFA was used on the second half to assess the new factor structure. All analyses were done using Mplus v8.1, in conjunction with R v3.5.2, RStudio v1.1.447, and MplusAutomation v0.7-2, a package which enables R to run and read Mplus files (Hallquist & Wiley, 2018; Muthén & Muthén, 1998-2017; R Core Team, 2018; RStudio Team, 2015).

Missing Data
Missing data were handled with two methods. When appropriate, multiple imputation (MI) was used. When multiple imputation was not appropriate, pairwise deletion was used. Multiple imputation was not possible for scale items in the J-SOAP-II when the majority of the scale was missing. One disadvantage of MI is that the fit statistics were not developed for this technique, and so the fit statistics provided are the mean over five imputations.
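The averaging of fit statistics across imputations described above can be sketched as follows. This is a minimal Python illustration, not the Mplus procedure used in the study; the function name `pool_fit_indices` and all numeric values are hypothetical.

```python
from statistics import mean


def pool_fit_indices(per_imputation):
    """Average each fit index across imputed datasets.

    per_imputation: list of dicts, one per imputed dataset, each mapping
    a fit-index name (e.g., "CFI") to its value in that dataset.
    """
    if not per_imputation:
        raise ValueError("need at least one imputed dataset")
    keys = per_imputation[0].keys()
    if any(d.keys() != keys for d in per_imputation):
        raise ValueError("every imputed dataset must report the same indices")
    return {k: mean(d[k] for d in per_imputation) for k in keys}


if __name__ == "__main__":
    # Hypothetical fit indices from five imputed datasets.
    fits = [
        {"CFI": 0.81, "RMSEA": 0.101},
        {"CFI": 0.80, "RMSEA": 0.103},
        {"CFI": 0.82, "RMSEA": 0.099},
        {"CFI": 0.81, "RMSEA": 0.102},
        {"CFI": 0.80, "RMSEA": 0.100},
    ]
    print(pool_fit_indices(fits))
```

A simple mean treats each imputed dataset equally and does not reflect between-imputation variability, which is one reason the averaged fit statistics should be read as descriptive.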

Confirmatory Factor Analysis
Confirmatory factor analysis determines how well a prespecified model fits the data by assessing whether the observed deviations from the model are greater than what would be expected by random chance. A four-factor CFA model was specified as described by Prentky and Righthand (2003). Model fit was determined by using the model χ² test, the Tucker-Lewis index (TLI), the comparative fit index (CFI), the weighted root mean square residual (WRMR), and the root mean square error of approximation (RMSEA). The model χ² test compares the covariance structure implied by the specified model with the observed data, and a significant (< .05) p-value is indicative of a poor fit. Because the χ² test is highly sensitive to large sample sizes, values of CFI and TLI > .95, WRMR < 1.0, and RMSEA < .06 were also used to define a good fit (Reeve et al., 2007; Yu, 2002). Due to the items being ordinal, the factor analysis was based on a polychoric correlation matrix (Holgado-Tello et al., 2010). The mean- and variance-adjusted weighted least squares (WLSMV) estimator was used, as it is designed for ordinal data and outperforms the robust maximum likelihood (MLR) estimator when the sample size is large (Li, 2016).
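For readers less familiar with these indices, the sketch below shows common textbook definitions of RMSEA, CFI, and TLI computed from model and baseline χ² values, together with the cutoffs stated above. It is an illustration in Python under stated assumptions, not the Mplus computation used in the study: with WLSMV, Mplus derives these indices from adjusted test statistics, and some programs use N rather than N - 1 in the RMSEA denominator. All function names and input values are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 variant)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline (null) model."""
    numerator = max(chi2_model - df_model, 0.0)
    denominator = max(chi2_model - df_model, chi2_base - df_base, 0.0)
    return 1.0 if denominator == 0 else 1.0 - numerator / denominator


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index relative to the baseline (null) model."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    if ratio_base == 1.0:
        raise ValueError("baseline chi-square/df ratio of 1 leaves TLI undefined")
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def meets_cutoffs(cfi_value, tli_value, rmsea_value, wrmr_value):
    """Apply the good-fit criteria: CFI and TLI > .95, WRMR < 1.0, RMSEA < .06."""
    return (cfi_value > 0.95 and tli_value > 0.95
            and rmsea_value < 0.06 and wrmr_value < 1.0)


if __name__ == "__main__":
    # Hypothetical chi-square values, invented for illustration only.
    print(rmsea(200.0, 100, 401))
    print(cfi(200.0, 100, 2100.0, 100))
    print(tli(200.0, 100, 2100.0, 100))
```

Note that CFI and TLI, unlike the model χ² test, are defined relative to a baseline model in which the items are assumed to be uncorrelated.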

Exploratory Factor Analysis
Based on the results of the CFA, a post-hoc goal of identifying an alternative factor structure for the J-SOAP-II was included. The dataset was randomly split in half. Exploratory factor analysis (EFA) was used on one half of the dataset to identify a new factor structure, and the newly proposed factor structure was tested on the second half of the dataset using CFA. As opposed to CFA, which tests a prespecified model against the data, EFA proposes model structures based upon the item correlations within the data. The most parsimonious model was selected by examining the eigenvalues, scree plot, and parallel analysis. Items were then removed on the basis of low factor loadings (< .3) or high cross-loadings (> .4 on two or more factors). The geomin oblique rotation was selected due to high correlation between the factors. With the other half of the dataset, a CFA was run with the newly proposed factor structure, using the same methods described above.
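The random split and the item-removal rule described above can be sketched as follows. This is a minimal Python illustration rather than the R/Mplus code used in the study; the function names, the seed, and the loading values are hypothetical, and comparing absolute loadings against the thresholds is an assumption of this sketch.

```python
import random


def split_half(ids, seed=0):
    """Randomly split participant IDs into two halves (seed is arbitrary)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def screen_items(loadings, low=0.3, cross=0.4):
    """Split items into (kept, dropped) using the two removal rules.

    loadings: dict mapping item name -> list of rotated loadings, one per
    factor. An item is dropped if its largest absolute loading is below
    `low`, or if it loads above `cross` on two or more factors.
    """
    kept, dropped = [], []
    for item, row in loadings.items():
        magnitudes = [abs(x) for x in row]
        too_low = max(magnitudes) < low
        cross_loaded = sum(m > cross for m in magnitudes) >= 2
        (dropped if too_low or cross_loaded else kept).append(item)
    return kept, dropped


if __name__ == "__main__":
    efa_half, cfa_half = split_half(range(909), seed=1)
    print(len(efa_half), len(cfa_half))

    # Hypothetical loadings on three factors, invented for illustration.
    example = {
        "item_a": [0.70, 0.10, 0.00],   # clean loading: kept
        "item_b": [0.20, 0.10, 0.25],   # all loadings low: dropped
        "item_c": [0.50, 0.45, 0.00],   # cross-loading: dropped
    }
    print(screen_items(example))
```

Fixing the seed makes the split reproducible, which matters because a different random split could in principle yield a different exploratory solution.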

Demographics
Demographic information specific to the study sample is provided in Table 1. A number of different demographic and other clinical data were obtained via a 90-min semistructured interview during the evaluation. Data included in the present study pertain to participant race/ethnicity and age. Overall, Black and White youth participants were similar in age, though the difference was statistically significant. Black youths also had a statistically significantly lower mean score for Scale 1, but higher mean scores for Scales 2, 3, and 4. Our sample did not include many youths from other races/ethnicities, so a meaningful subanalysis could not be conducted for them.

Factor Structure of Model #1
Confirmatory factor analysis was used to assess whether Model #1, the four-factor model proposed by Prentky and Righthand (2003), fit the data collected on 909 youths, aged 12-18, who had committed a sex offense.
Model #1 includes 28 items, and four latent variables (factors). The four factors correspond to the four subscales of the J-SOAP-II, which are the Sexual Drive/Preoccupation, Impulsive/Antisocial Behavior, Intervention, and Community Stability/Adjustment Scales. All factors were allowed to be inter-correlated with each other, resulting in six intercorrelations. The same model structure was then run on the 462 White youths and 411 Black youths separately to determine whether the fit was different between groups.
Model fit was based upon multiple fit indices: the chi-square test, RMSEA, CFI, TLI, and the standardized root mean square residual (SRMR). Each of these fit indices indicated a poor fit in the complete sample, as well as in the White and Black subsamples (see Table 2). The model fit was poorer for Black youths than for White youths and the total sample, but because the model fit poorly for both White and Black youths, there was little to be gained from a more in-depth examination of measurement invariance.
The factor correlations and standardized geomin oblique rotated factor loadings were derived for the complete sample only (see Table 3; Muthén & Muthén, 1998-2017). Items 6, 9, and 16 loaded poorly onto their respective factors (< .4). The correlation between Factor 2 and Factor 4 was very high, while Factor 3 was moderately associated with both Factors 2 and 4 (see Table 4).

Exploratory Factor Analysis
Due to the poor fit of the CFA model, a post-hoc goal of determining a new factor structure was added. The dataset was divided in half randomly so that one half (n = 454) could be used to propose a new factor structure using EFA, and the second half (n = 455) could be used to test that factor structure using CFA. The number of factors extracted was based upon the eigenvalues, scree plot, parallel analysis, and factor loadings. The eigenvalues showed 7 factors above 1.0. The scree plot did not show a clear cut-off, but the parallel analysis line indicated a four-factor solution. After examining the factor loadings of the three- and four-factor solutions, only the three-factor solution had at least 3 items uniquely load onto each factor (> .4 on one and only one factor). Geomin rotated factor loadings are presented in Table 5. Most items loaded onto the same factor as the original structure, albeit with several significant changes. In Factor 1, Items 1 (prior legally charged sex offenses), 6 (sexualized aggression), and 8 (sexual drive & preoccupation) were dropped due to low factor loadings. Factor 2 primarily comprised items from the Impulsive/Antisocial Behavior and Community Stability/Adjustment scales, though it also included Item 23 (quality of peer relationships), and dropped Items 9 (caregiver consistency), 16 (history of physical assault and/or exposure to family violence), and 24 (management of sexual urges and desire). The only change to Factor 3 was that Item 23 moved to Factor 2. This made for a total of six items dropped, two items moved, and two factors merged into one.
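The three decision rules mentioned above (eigenvalues above 1.0, comparison against parallel-analysis reference eigenvalues, and at least three uniquely loading items per factor) can be expressed as simple checks. The Python sketch below is illustrative only: the function names are hypothetical, the eigenvalues and loadings are invented, and the reference eigenvalues that parallel analysis obtains from random data are assumed to be supplied rather than simulated here.

```python
def kaiser_count(eigenvalues):
    """Number of eigenvalues above 1.0."""
    return sum(e > 1.0 for e in eigenvalues)


def parallel_count(observed, reference):
    """Number of leading factors whose observed eigenvalue exceeds the
    corresponding reference eigenvalue from random data."""
    count = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break
        count += 1
    return count


def unique_items_per_factor(loadings, threshold=0.4):
    """For each factor, count items loading above `threshold` (in absolute
    value, an assumption of this sketch) on that factor and no other."""
    if not loadings:
        raise ValueError("loadings must contain at least one item")
    n_factors = len(next(iter(loadings.values())))
    counts = [0] * n_factors
    for row in loadings.values():
        salient = [i for i, x in enumerate(row) if abs(x) > threshold]
        if len(salient) == 1:
            counts[salient[0]] += 1
    return counts


def every_factor_has_enough_items(loadings, threshold=0.4, min_items=3):
    """True if each factor has at least `min_items` uniquely loading items."""
    counts = unique_items_per_factor(loadings, threshold)
    return all(c >= min_items for c in counts)


if __name__ == "__main__":
    # Hypothetical eigenvalues, invented for illustration only.
    observed = [3.2, 1.8, 1.1, 0.9]
    reference = [1.5, 1.3, 1.2, 1.1]
    print(kaiser_count(observed), parallel_count(observed, reference))
```

As in the analysis above, these rules need not agree with one another, so the final number of factors remains a judgment informed by all of them.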

Factor Structure of Model #2
Confirmatory factor analysis was used on the second half of the dataset to test the newly identified factor structure. Model #2 includes 22 observed variables, and three latent factors. The three factors were all free to be intercorrelated with each other, resulting in three intercorrelations.
As described above, the model fit was based on the chi-square goodness-of-fit test, RMSEA, CFI, TLI, and SRMR. All of the fit statistics were greatly improved in comparison with the original factor structure, but did not meet criteria for a good fit. Factors 2 and 3 remained moderately correlated with each other, but neither was significantly correlated with Factor 1 (see Table 6). Modification indices were examined, yet no model modification based on modification indices resulted in a meaningfully improved fit. Model #2 was also run separately for both White and Black youths, but the model fit was not different for either racial group from what was observed in the combined model.

Discussion
The purpose of the current study was to examine the factor structure of the J-SOAP-II, which, to the authors' knowledge, has not yet been examined, to determine the presence of risk-relevant latent constructs. The four-factor model following the structure defined by Prentky and Righthand (2003) provided a poor fit to the data. Furthermore, when the model fit was calculated separately for Black and White youths, the model fit was poor for both, yet was poorer for Black youths. Exploratory factor analysis (EFA) was then used on one half of the dataset to explore other possible factor structures, and CFA was used on the other half to test the fit of the new model. Results indicated a three-factor solution that fell short of the criteria for a good fit.
Altogether, findings from the current study indicate that the J-SOAP-II as it is presently structured does not adequately tap into a latent construct. As latent constructs explain the covariances among items, a poor fit could indicate that the items that make up each individual subscale do not sufficiently covary, and so could contribute to the high variability in the predictive validity of the J-SOAP-II (e.g., Fanniff & Letourneau, 2012). The improvement in fit from our newly proposed model indicates that there are at least three latent constructs within the J-SOAP-II worth exploring. Further, due to the inadequate fit observed in the current study, the J-SOAP-II items are likely insufficient to properly tap into these constructs. Correcting this problem would require a new iteration of the J-SOAP-II that rewords or adds entirely new items designed specifically to address the factors. Yet, despite this limitation, our three-factor solution provides a view into the potential latent constructs observed in the J-SOAP-II. Each factor is described below.
The new J-SOAP-II Factor 1 focuses on a youth's sexual offending and victimization history, and reflects five of the items found on the Sexual Drive/Preoccupation scale. Surprisingly, Item 1 (prior legally charged sex offenses) did not load onto Factor 1, though it would be expected to load similarly to two seemingly related items: Items 2 (number of sexual abuse victims), and 4 (duration of sex offense history).
Factor 2 consisted of items which evaluate risk for general delinquency, as the items reflect historical evidence of delinquent behavior, and present lack of support systems and anger management, which are related to delinquency. Additionally, this newly formed Factor 2 appears to represent content from both the initially indicated J-SOAP-II's Impulsive/Antisocial Behavior and Community Stability/Adjustment scales. The merging of these two scales was not surprising considering the high correlation that was found (see Table 4). Moreover, Item 23 (quality of peer relationships) was moved from the Intervention scale to Factor 2, and select items considered related to intervention were removed: Items 9 (caregiver consistency), 16 (history of physical assault and/or exposure to family violence), and 24 (management of sexual urges and desire).
It is worth noting that prior research did not find such a high correlation between the Impulsive/Antisocial Behavior and Community Stability/Adjustment scales, and so it is highly possible that this merge would not be replicated if EFA were run on a different dataset (Martinez et al., 2007; Viljoen et al., 2008). One potential explanation for this could be that all youths in the present study had been convicted and confined for a sexual offense approximately two weeks prior to the assessment, whereas Viljoen et al. (2008) scored the J-SOAP-II retrospectively based on extensive patient records, and approximately 40% of the youths described in Martinez et al.'s (2007) paper were not adjudicated.
Factor 3 taps into the antisocial beliefs and attitudes of the youth, and remained largely unchanged from Scale 3, with only one item (quality of peer relationships) from Scale 3 moving into Factor 2. While every item is titled in a prosocial manner (e.g., "Accepting responsibility for offense[s]"), a high score always reflects an antisocial response (e.g., "2 = Accepts no responsibility, or there is full denial"). The primary distinction between this factor and Factor 2 appears to be that Factor 3 is focused on individual beliefs, while Factor 2 is focused primarily on the unique history and personal situations influencing each youth.
The newly proposed factor structure was designed to more adequately tap into latent constructs, which does not necessarily imply that the predictive validity was increased. For instance, the item pertaining to a high sexual drive was removed, yet high sexual drive is an important predictor for juvenile recidivism, and the clinical utility of the antisocial personality items is enhanced among youths with heightened sexual drives (Wijetunga et al., 2018). Due to this, the three-factor solution should not be considered a proposed change to the J-SOAP-II. Instead, the results should only be examined so far as to see which questions are tapping into which possible latent constructs. Future research can expand upon our findings to examine if these constructs are predictive of sexual or non-sexual reoffense risk, and if so, propose changes that would result in a version of the J-SOAP that concretely taps into specific latent constructs, and has consistently high predictive validity.
Using factor analysis to identify psychological traits present in actuarial scales for juveniles is fairly uncommon in the research literature specific to juvenile sexual offending. Martin (2017) found four factors in the JSORRAT-II: Sexual Offending History, General Criminality, Familial Abuse, and Educational Disruptions. In comparison to the current study, only the first factor is paralleled. The items making up the remaining three factors that Martin (2017) found are all represented in the present study. Sexual Offending and Victimization History includes an item about sexual victimization, which is one item in Martin's Familial Abuse factor. Risk for General Delinquency includes Items 14 (ever charged or arrested before age 16) and 15 (multiple types of offenses), which are similar to the items in General Criminality, and Items 11 (school behavior problems) and 27 (stability in school) map onto Educational Disruptions (Martin, 2017). Similarly, Rojas and Olver (2019) found three factors in the VRS-YSO: Sexual Deviance, Antisocial Tendencies, and Family Concerns. Sexual Deviance and Antisocial Tendencies cover similar items to Sexual Offending and Victimization History and Antisocial Beliefs and Attitudes, respectively, while the items in Family Concerns are similar to items in Risk for General Delinquency, though the latter factor is broader. These similarities indicate that the newly proposed three-factor structure for the J-SOAP-II is tapping into similar constructs when compared with other juvenile risk assessments, but perhaps is grouping them into broader categories.

Study Limitations
The results of this study should be viewed through the lens of a few limitations. Our sample was limited to Alabama youths who have been convicted of a sexual offense for which they are confined in a juvenile correctional facility, and so may not generalize to the rest of the U.S., or to youth with sexual offenses in other countries. In addition, the youths in this study were disproportionately low SES and predominantly Black or White. While there was a large sample size for Black and White youths, there was not a large enough sample of youths of any other race/ethnicity, so we were unable to examine how the J-SOAP-II may fit differently for them. Because no follow-up data were available, the predictive validity of the J-SOAP-II could not be determined in the present study; such data would have allowed for stronger conclusions to be drawn. Finally, there was only one coder for each administration of the J-SOAP-II, preventing interrater reliability analysis and allowing for a potential source of error.

Implications and Future Directions
Clinicians, particularly those working with youth who have an interpersonal offense, are often required to conduct risk estimates that can shape the trajectory of sentencing for a youth, yet the present tools are lacking in effectiveness. While the poor fit of the J-SOAP-II observed in the present study does not imply anything about its predictive validity, previous mixed findings on predictive validity call into question the appropriateness of using the J-SOAP-II to determine risk level. The present study's finding of a poor factor structure suggests that the items are not grouped in a way that assesses a particular latent construct, which may contribute to the inconsistent findings in predictive validity (Barra et al., 2018; Fanniff & Letourneau, 2012).
Findings from this study further indicate that clinicians conducting assessments that incorporate the J-SOAP-II should consider the individual J-SOAP-II items that are endorsed to guide their treatment plan, rather than using the scale scores (particularly scale scores derived using the currently recommended scale scoring guidelines) to guide their risk assessment. Using the item grouping found in the present study may improve risk assessment interpretation. Future research should examine different possible factor structures, as well as examine the predictive validity of the three-factor solution proposed here compared to the presently used four-factor solution.
This study sought to provide recommendations for the revision of the J-SOAP-II. Our findings indicate that the present J-SOAP-II is not adequately tapping into a latent construct, and the present items do not allow it to do so. That is to be expected, as it was not designed explicitly to assess a latent construct. Modifications with the explicit goal of assessing risk-relevant latent constructs could result in a scale that more adequately and consistently predicts risk, and provides meaningful treatment goals.
Finally, the effect of race on sexual reoffense risk scales is also of vital importance. Prior research demonstrated that race is significantly related to the J-SOAP-II score (Fix et al., 2017). The present study extends these findings and indicates that race affects the model fit, suggesting future research examining effects of race on J-SOAP-II scoring, interpretation, and evaluation of predictive validity is warranted.

Conclusion
The newly proposed three-factor solution observed in the present study provides a far improved fit, but the questions as they currently stand are not able to be grouped into a factor structure that provides a good fit to these data. Future research should focus on rewording existing items, and adding new items to the J-SOAP-II to better capture risk-relevant latent constructs. As the newly observed three-factor J-SOAP-II solution does not have a good fit, and the predictive validity of the J-SOAP-II was not tested in the current study, this should not be considered an immediate change to the J-SOAP-II. However, our findings suggest that clinicians would benefit from considering Scales 2 and 4 to be measuring the same construct, and that the items that were removed in the formation of Model #2 may not be a useful investment of the clinician's and patient's time. Using factor analysis to identify risk-relevant latent constructs can better aid an RNR-focused treatment by promoting item reduction so that fewer resources are spent on any one assessment, and allowing even static items to contribute to a dynamic treatment target.

Funding:
The authors received no financial support for the research, authorship, and/or publication of this article.

Competing Interests:
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Acknowledgments:
The authors would like to thank the Alabama Department of Youth Services for their continued support of this research project through many years of service. We would also like to thank Dr. Kelli Thompson and the graduate and undergraduate students of the Juvenile Delinquency Lab at Auburn University for their assistance with data collection.

Author Note:
The authors take responsibility for the integrity of the data, the accuracy of the data analyses, and have made every effort to avoid inflating statistically significant results.

Data Availability:
Data and code can be made available to select research teams by contacting Dr. Rebecca Fix at rebecca.fix@jhu.edu.