Different dimensions form the basis for evaluation and classification of individual interventions in Ungsinn’s articles. How well the intervention is described, its theoretical foundation, results from efficacy or effectiveness evaluations (if applicable), and the intervention’s implementation strategies together determine which evidence level the intervention receives in Ungsinn. In articles published after November 2015, more of these dimensions are scored.
These dimensions either provide information about the effects of the intervention or information that is vital to the development of such knowledge. The various aspects build naturally upon one another. A good description is a prerequisite for working out a theoretical foundation for the anticipated mechanisms of the intervention. Both the description and the theoretical foundation provide the basis for examining the possible effects of an intervention. Thereafter, a strategy that ensures good quality of implementation and dissemination of the intervention is necessary to ensure that the effects found through research are maintained when the intervention is put to use in ordinary practice (Fixsen et al., 2005; Greenhalgh, Robert, Macfarlane, Bate, & Kyriakidou, 2004; Durlak & DuPre, 2008; Meyers et al., 2012; Sørlie et al., 2010).
Scoring the quality of the description
Various aspects of the description of the intervention are assessed on a three-point scale from «not described» to «well described». These are indicated in Table 1.
Table 1. Schematic overview of description quality scoring
|Dimensions in the description||Not described||Somewhat described||Well described|
|Description of the problem|
|Design of the intervention|
|Core elements / flexibility|
|Manual/guide for practitioners|
|Materials for intervention users|
|Studies that strengthen the description|
Description of the problem:
The problem area is described including, for example, scope, risk factors, consequences of the problem, co-variation with other problems or risk of developing other issues.
The target group is described including relevant characteristics. Inclusion and exclusion criteria are provided.
The description clearly presents the intervention’s main goal(s).
The description clearly presents the intervention’s secondary goal(s), if applicable. This may include, for example, a reduction of risk factors or increase in protective factors, or it may be objectives that are viewed as less significant than the main goal.
Design of the intervention:
It is clear from the description how the intervention is organized and structured; for example, whether it will be offered in groups or individually, who the providers are, the duration, frequency, thematic structure, order, progression and location of activities.
The methods employed to reach the desired goals are described in as much detail as possible; for example, whether the methods include cognitive techniques, behavior therapy or psychoeducation, and whether they involve practical exercises, homework, video or role playing.
Core elements and flexibility:
The description addresses which core elements are considered absolutely required in the practice of the intervention and in which areas there is room for flexibility.
It is specified who may offer the intervention. This may include which professional groups may deliver the intervention, what training is necessary and through which agencies/organizations the intervention may be provided.
Manual/guide for practitioners:
There is a manual/guide for practitioners that describes in detail how the intervention should be delivered.
Materials for intervention users:
There are materials for users of the intervention.
Scoring of quality of research methodology
For each study, five methodological aspects are scored as outlined in Table 2. Each aspect is evaluated on a scale from 0 to 4. The scoring of each study is summarized with a point score for each aspect, an average score and a comprehensive description.
Table 2. Scoring of quality of research methodology
|Study||1. Statistical analyses||2. Measurement||3. Internal validity||4. Fidelity||5. External validity||Average|
Note: The scale used is: 0 = not reported or investigated, 1 = poor/unsatisfactory, 2 = satisfactory, 3 = good, 4 = very good.
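As a rough illustration of how the scores for one study might be summarized, the sketch below averages the five aspect scores from Table 2. The function and key names are hypothetical. Treating a score of 0 (not reported or investigated) as an ordinary value in the mean is an assumption, since the source does not say how such scores enter the average.

```python
from statistics import mean

# Hypothetical keys for the five methodological aspects in Table 2.
ASPECTS = (
    "statistical_analyses",
    "measurement",
    "internal_validity",
    "fidelity",
    "external_validity",
)

# Scale labels as given in the note to Table 2.
SCALE_LABELS = {
    0: "not reported or investigated",
    1: "poor/unsatisfactory",
    2: "satisfactory",
    3: "good",
    4: "very good",
}


def average_score(scores: dict) -> float:
    """Return the plain arithmetic mean of the five aspect scores.

    Assumption: a score of 0 counts like any other value.
    """
    missing = [aspect for aspect in ASPECTS if aspect not in scores]
    if missing:
        raise ValueError(f"missing aspect scores: {missing}")
    for aspect in ASPECTS:
        if scores[aspect] not in SCALE_LABELS:
            raise ValueError(f"{aspect} must be an integer from 0 to 4")
    return mean(scores[aspect] for aspect in ASPECTS)


# Invented scores for one study, for illustration only.
example_study = {
    "statistical_analyses": 3,
    "measurement": 2,
    "internal_validity": 3,
    "fidelity": 1,
    "external_validity": 2,
}
example_average = average_score(example_study)
```

The per-aspect scores would still be reported alongside the average, since a single mean can hide a weak aspect such as fidelity.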
1. Statistical analyses
Here, it is evaluated whether the statistical analyses are adequate and whether the necessary prerequisites for applying them are fulfilled. It is assessed whether the study had sufficient statistical power and whether dropout analyses were performed. The statistical analyses must be appropriate to the design used in the study. Additionally, effect sizes and relevant significance tests must be reported, or results that allow these to be calculated. Effects should be significant at p < .05 or lower and in the expected direction. There must not be any significant negative effects on the main outcome variables.
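When a study reports only group means, standard deviations and sample sizes, an effect size can often be derived from them. The sketch below computes Cohen's d with a pooled standard deviation. This is one common choice and an assumption here, since the source does not prescribe a particular effect size measure. The function name and the example numbers are invented for illustration.

```python
import math


def cohens_d(mean_t: float, sd_t: float, n_t: int,
             mean_c: float, sd_c: float, n_c: int) -> float:
    """Standardized mean difference between intervention (t) and control (c).

    Uses the pooled standard deviation of the two groups.
    """
    if n_t < 2 or n_c < 2:
        raise ValueError("each group needs at least two participants")
    pooled_variance = (
        (n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2
    ) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_variance)


# Invented post-test values: intervention group mean 12, control group mean 10,
# both with standard deviation 4 and 50 participants.
d = cohens_d(12.0, 4.0, 50, 10.0, 4.0, 50)
```

Whether a positive d is the expected direction depends on the outcome measure, for example whether higher scores indicate more or fewer symptoms.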
2. Measurement (reliability and validity)
Here, the reliability and validity of the measurements are assessed with an eye on the stated outcomes of the intervention. It is preferred that well-established measurement instruments are used and that their reliability and validity have been examined, preferably in Norwegian or Nordic samples. See Psyktestbarn.no for a further description of how the quality of tests should be examined. Corresponding requirements apply to other measurement methods such as observations and interviews; i.e., they must be reliable and valid. It is also preferable that those who assess outcome variables are blind to which intervention or control condition the participants have received, so that the results cannot be influenced by such knowledge.
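Reliability of a multi-item instrument is often summarized with Cronbach's alpha, which Table 3 names as an example. The sketch below computes alpha from raw item responses using the standard formula, the number of items k divided by k - 1, multiplied by one minus the ratio of summed item variances to the variance of the total score. The function name and the data layout (one list of responses per item) are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(items: list) -> float:
    """Cronbach's alpha for a scale.

    `items` holds one list per item; each list contains that item's
    responses, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n_respondents = len(items[0])
    if any(len(item) != n_respondents for item in items):
        raise ValueError("all items need the same number of responses")
    # Total score per respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Two items that agree perfectly, and two that agree only partly.
alpha_perfect = cronbach_alpha([[1, 2, 3], [1, 2, 3]])
alpha_partial = cronbach_alpha([[1, 2, 3], [2, 1, 3]])
```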
3. Internal validity
The likelihood that there are no threats to the internal validity of the study must be discussed (i.e., that the intervention is the reason for the change rather than other factors). In general, the strongest design is a true experiment or RCT when the aim is to determine whether the intervention has an effect. However, there are other factors that may threaten the internal validity of such a design; e.g., large dropout or differential dropout in the two groups. Another source of error is diffusion between the experimental and control conditions; e.g., that participants in the control condition try to gain access to the same intervention that the intervention group has received.
Quasi-experimental designs can have various forms of control/comparison groups. In the simplest form, the participants themselves make up the control condition; e.g., in a pre-/post-test design. With such a design there are many threats to internal validity, such as history (other external factors have caused the change) or maturation (the individuals change as a result of time passing rather than due to the intervention).
The comparison groups in quasi-experimental designs are of various types. Some are selected because they are similar to the intervention group, while others are assembled more or less by accident or, in the worst case, result from self-selection. In assessing the design, it is therefore important to know how the groups have been selected, whether possible pre-test differences have been investigated and whether these have been accounted for in the analyses. Various other threats to internal validity are described by Shadish, Cook and Campbell (2002). Not all threats are relevant in all contexts; they must therefore be assessed according to what is relevant for the individual study. Other examples of quasi-experimental designs that may be suitable for use in effectiveness studies include cohort studies and «stepped wedge trials» (SWT; Brown & Lilford, 2006). In an SWT the intervention is presented to all participants or groups of participants, but at different time points. This may be an appropriate way of proceeding in situations where, for practical or ethical reasons, it is problematic to leave someone out when offering the intervention. At the end of the intervention period, everyone has received the intervention, while the time point at which this occurs is random. With a longitudinal cohort design, which may be used to evaluate school-based interventions, among others, the intervention is presented to the entire school and to all grades. After some time (for example, one year after initiation of the intervention), a given class grade is compared with the same grade from the pre-test time point (Olweus, 2005).
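The stepped wedge idea can be made concrete with a small schedule generator. The sketch below is a simplified illustration under stated assumptions: there is one baseline period in which no group has the intervention, exactly one group crosses over per subsequent period, and the crossover order is drawn at random. The function name and the group labels are hypothetical.

```python
import random


def stepped_wedge_schedule(groups: list, seed=None) -> dict:
    """Map each group to a list of booleans, one per period.

    True means the group is receiving the intervention in that period.
    Period 0 is a baseline period for every group.
    """
    rng = random.Random(seed)
    order = list(groups)
    rng.shuffle(order)  # random crossover order
    n_periods = len(order) + 1  # baseline plus one step per group
    schedule = {}
    for step, group in enumerate(order, start=1):
        # The group switches at its step and keeps the intervention afterwards.
        schedule[group] = [period >= step for period in range(n_periods)]
    return schedule


# Three hypothetical schools; the seed only makes the example repeatable.
schedule = stepped_wedge_schedule(["School A", "School B", "School C"], seed=1)
```

Each row starts with at least one control period and ends with the intervention, so every group contributes both control and intervention observations.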
4. Fidelity
This point deals with the degree to which it has been ensured that the treatment/intervention is delivered in line with its intentions and in the same way to all participants. One way of ensuring this is through manuals and training for those who will deliver the intervention and by establishing systems that assure quality during the course of the study; e.g., through checklists or video analyses (see, for example, Mowbray, 2003).
5. External validity
External validity is understood as the degree to which the results from the study may be generalized to real life, to other target groups and over time. Under this point, it is evaluated, among other things, whether the study has been carried out under the same conditions as will apply in general practice, and to what degree follow-up studies have been done to see if the effects are maintained over time; e.g., after 6 months, 12 months or longer. What is considered a reasonable follow-up period depends on the type, scope and objectives of the intervention and the age of the children. Correspondingly, generalization to other settings and target groups requires studies that examine such generalization.
The five methodological aspects assessed for each study are summarized in Table 3 in the form of control questions that the Ungsinn author must evaluate.
Table 3. Overview of methodological aspects evaluated for each study
|1. Statistical analyses|
|Which analyses are performed to determine effects?|
|Are the analyses adequate and assessed as being in line with the current design being used?|
|Are the statistical prerequisites for the analysis fulfilled?|
|Does the study have sufficient statistical power?|
|Is dropout/missing reported and evaluated?|
|Have dropout analyses been done and, when appropriate, ITT analyses?|
|Are effect sizes, or results that may be converted to such, reported?|
|2. Measurement (reliability and validity)|
|Is relevant reliability for the various measurement instruments reported based on the current sample (e.g., in the form of a Cronbach’s alpha)?|
|Is the reliability sufficient, particularly on the central outcome variables?|
|Do measurement instruments have good construct validity (i.e., are well-established instruments being used or the validity documented in another way)?|
|Have multiple informants been included (e.g., children, parents, teachers)?|
|Norwegian/Nordic suitability: Have measurement instruments employed been developed in other countries and translated to Norwegian? If so, what is the quality of this work and how well documented is it that the psychometric properties are adequate for the Norwegian version?|
|In the case of interview or observation data, has the reliability and validity been investigated?|
|3. Internal validity – causality|
|Which design has been used (e.g., RCT, quasi-experiment, cohort)?|
|How has the randomization been carried out (if RCT)? For quasi-experimental design with comparison group, what has been done to ensure that the groups are as similar as possible?|
|Can there be diffusion between the conditions?|
|Have sources of biases, such as history, maturation/aging, testing, instrumentation, statistical regression or dropout been evaluated?|
|To what degree has the possibility for sources of biases been assessed and discussed; and what, if anything, has been done to eliminate such possibility?|
|4. Fidelity|
|Are there manuals and training for those who perform the intervention?|
|Are quality assurance procedures, ensuring that the intervention will be carried out in line with its intention and equally for all participants, reported in the study?|
|5. External validity|
|Have several studies been performed, including follow-up investigations; and, if so, over what period of time?|
|How well does the sample selection correspond to those to whom the effect is generalized (i.e., representation in relation to age, gender and symptoms)?|
|Has the intervention been performed in general practice or under regular conditions, such as it is expected to be delivered in the future?|
Scoring of implementation quality
Different aspects of the intervention’s quality assurance systems are evaluated and scored using a form. Good quality assurance systems are expected to promote high quality of implementation. Each category is listed in Table 4 and assessed as acceptable, not acceptable or, alternatively, not relevant to the particular intervention. The evaluation is summarized with a total score (total acceptable/total possible).
Table 4. Evaluation of the intervention’s systems to promote good implementation quality
|1. Implementation support|
|2. Qualification requirements|
|3. Training in the intervention|
|4. Certification procedures|
|5. Monitoring of fidelity/long-term delivery|
|6. Guidance of practitioners|
|7. Identification of target groups|
|8. Documentation and sustainment tools|
|9. Strategies for adaptation|
|Total score||Total acceptable / total possible (for example, 5 yes / 9 possible)|
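The total score can be sketched as a simple count. In the example below, each category is rated as acceptable, not acceptable or not relevant, and the total is reported as acceptable over possible. Excluding categories rated not relevant from the total possible is an assumption about how the form is summarized. The names used are hypothetical.

```python
from enum import Enum


class Rating(Enum):
    ACCEPTABLE = "yes"
    NOT_ACCEPTABLE = "no"
    NOT_RELEVANT = "not relevant"


def implementation_score(ratings: dict) -> tuple:
    """Return (total acceptable, total possible) for one intervention.

    Assumption: categories rated NOT_RELEVANT do not count as possible.
    """
    relevant = [r for r in ratings.values() if r is not Rating.NOT_RELEVANT]
    acceptable = sum(1 for r in relevant if r is Rating.ACCEPTABLE)
    return acceptable, len(relevant)


# Invented ratings for the nine categories, for illustration only.
example_ratings = {
    "implementation_support": Rating.ACCEPTABLE,
    "qualification_requirements": Rating.ACCEPTABLE,
    "training": Rating.ACCEPTABLE,
    "certification": Rating.NOT_ACCEPTABLE,
    "fidelity_monitoring": Rating.NOT_ACCEPTABLE,
    "guidance": Rating.ACCEPTABLE,
    "target_group_identification": Rating.ACCEPTABLE,
    "documentation_tools": Rating.NOT_ACCEPTABLE,
    "adaptation_strategies": Rating.NOT_ACCEPTABLE,
}
acceptable, possible = implementation_score(example_ratings)
summary = f"{acceptable} yes / {possible} possible"
```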
1. Implementation support.
The provider offers support to organizations that will implement the intervention and to practitioners who will deliver it. The support is offered either by a provider organization or by the intervention owner. Implementation support may take the form of informational and preparatory meetings, training seminars, booster sessions or the like. Criterion for acceptance: the provider gives implementation support to the organization/agency. There should be a description of the provider’s obligations to the recipient organization.
2. Practitioner’s qualifications.
There is a minimum requirement for educational level and experience (e. g., number of years in the practice field or work experience with children and youth) in order to deliver the program in an effective way. This applies to practitioners, counselors, mentors, trainers and other relevant roles. The criterion for acceptance: the intervention description clearly presents the prior qualifications that are needed to practice the intervention.
3. Training in the intervention.
Training in use of the intervention is provided and the following aspects must be described: duration of the training, training methods and frequency of training required to deliver the program. The training must focus on core components of the intervention, meaning the parts of the intervention that are necessary for it to have the intended effect. In a good quality assurance system, the training is described in detail, whereby skills and knowledge that should be achieved, content, scope, teaching methods, course instructor qualifications and learning materials are presented. Criterion for acceptance: training in the intervention is provided and the scope and content of the education is well described.
4. Certification process
There are requirements for formal competence to be able to deliver the intervention, and there is a certification process that assures the quality of the competence. For example, certification may be given based on training of a certain duration and scope in addition to follow-up and further studies to maintain the competence over time. Certification may also be based on testing of the practitioner’s skills. Criterion for acceptance: there is a certification process for those who deliver the intervention, and the procedures for such certification are presented in the intervention description.
5. Quality assurance of fidelity
There are systematic ways to register sustained fidelity to the core components of the intervention, as anticipated by the program developer. For example, this may include scoring of video of the practitioner at work providing the intervention, or feedback through questionnaires or checklists filled out by the practitioner. Criterion for acceptance: the provider systematically monitors the quality of the intervention delivery and follows up on this, if needed.
6. Guidance of practitioners
Intervention-related guidance is provided during delivery of the intervention, after training is completed. This comes in addition to any usual supervision that is offered at the workplace. The description of such guidance may include duration and frequency of intervention-related guidance, the counselor’s role during implementation of the intervention and competence requirements for the counselor. Criterion for acceptance: the intervention has described systems for guidance of practitioners.
7. Identification, screening and recruitment of target groups for the intervention
The target group for the intervention is precisely specified through inclusion and exclusion criteria for participation, and recommended methods for recruitment have been formulated. This may also include the recommendation of instruments for screening or investigation. Children who score above the defined criteria for mental health problems or children with serious cognitive deficiency may serve as examples of meeting the inclusion criteria. Criterion for acceptance: inclusion and exclusion criteria for the target group are precisely specified in the description and there are recommended methods for recruitment of the children/youth/families that are expected to benefit from the intervention.
8. Guidelines for data collection and tools for maintaining effects
There are instruments that the practitioner may use to follow the development of the individual client/user/child/family in order to examine whether he or she is benefitting from the intervention and is satisfied with it. Criterion for acceptance: there are instruments that may be used to register the benefits of the intervention on an individual level.
9. Strategies for adaptation of the intervention
There is knowledge on the degree to which and how the intervention may be adapted to different target groups, to other agencies/organizations (e.g., whether an intervention that has been tried out in school health services can also be offered in child welfare services) and to other cultural contexts without reducing the effectiveness of the intervention. The reasoning for generalization and adaptation of the intervention should be grounded in empirical evidence. Criterion for acceptance: there is a description of the arenas where the intervention may be offered.
References
Brown, C. A., & Lilford, R. J. (2006). The stepped wedge trial design: A systematic review. BMC Medical Research Methodology, 6:54. doi: 10.1186/1471-2288-6-54.
Durlak, J. A., & DuPre, E. P. (2008). Implementation matters: A review of research on the influence of implementation on program outcomes and the factors affecting implementation. American Journal of Community Psychology, 41, 327-350. doi: 10.1007/s10464-008-9165-0.
Fixsen, D. L., Naoom, S. F., Blasé, K. A., Friedman, R. M., & Wallace, F. (2005). Implementation research: A synthesis of the literature. Tampa: University of South Florida.
Meyers, D. C., Durlak, J. A., & Wandersman, A. (2012). The Quality Implementation Framework: A synthesis of critical steps in the implementation process. American Journal of Community Psychology, 50, 462-480. doi: 10.1007/s10464-012-9522-x.
Mowbray, C. T. (2003). Fidelity criteria: Development, measurement, and validation. American Journal of Evaluation, 24, 315-340.
Olweus, D. (2005). A useful evaluation design, and effects of the Olweus Bullying Prevention Program. Psychology, Crime, and Law, 11, 389-402.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Sørlie, M-A., Ogden, T., Solholm, R., & Olseth, A. R. (2010). Implementeringskvalitet – om å få tiltak til å virke: En oversikt. [Implementation quality: Ensuring that interventions work. An overview]. Journal of the Norwegian Psychological Association, 47, 315-321.
The text above is taken and translated from Martinussen, M., Reedtz, C., Eng, H., Neumer, S. P., Patras, J., & Mørch, W. T. (2016). Ungsinn – kriterier og prosedyrer for vurdering og klassifisering av tiltak. [Ungsinn – Criteria and procedures for evaluation and classification of interventions]. Tromsø: UiT The Arctic University of Norway.