Methodology

A note with full technical detail on PASCAL's methodology can be found here.

How does PASCAL make its suggestions?

PASCAL’s aim is to help local authority commissioners make the best use of their Best Start in Life (BSiL) budget. It does so by identifying which of the 20+ early years interventions included in the tool are most likely to deliver the biggest impact, and which types of family should be offered each intervention.

When we break down the varying needs of families and children across the country, and which programmes are best evidenced to support those specific needs, there are billions of possible combinations of programmes that could be offered in different locations. These need to be compared for their likely overall impact on the Good Level of Development (GLD). It isn’t possible to do this manually, but our modelling can handle these comparisons at scale and at pace. The output from these models is the portfolio of interventions that is likely to maximise the impact of a given area’s BSiL funding on GLD, based on available information.
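
To give a sense of the selection problem involved, here is a minimal illustrative sketch in Python. It is not our actual optimiser: every programme name, impact figure and cost below is hypothetical, and it uses a brute-force search that only works for toy examples, which is precisely why comparisons at real-world scale need statistical modelling.

    from itertools import product

    # Hypothetical candidate "offers": (intervention, family type,
    # expected impact on GLD across that group, cost of delivery).
    candidates = [
        ("Programme A", "families with language needs",        120.0, 50_000),
        ("Programme A", "families with socioemotional needs",   60.0, 40_000),
        ("Programme B", "families with language needs",         90.0, 30_000),
        ("Programme B", "families with socioemotional needs",  110.0, 45_000),
    ]
    budget = 100_000

    # With n candidate offers there are 2**n possible portfolios, so
    # exhaustive comparison is only feasible for tiny examples like this one.
    best_portfolio, best_impact = [], -1.0
    for keep in product([False, True], repeat=len(candidates)):
        chosen = [c for c, k in zip(candidates, keep) if k]
        cost = sum(c[3] for c in chosen)
        impact = sum(c[2] for c in chosen)
        if cost <= budget and impact > best_impact:
            best_portfolio, best_impact = chosen, impact

    for name, group, impact, cost in best_portfolio:
        print(f"{name} for {group}: impact={impact}, cost=£{cost:,}")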

To do this, our model gives each intervention a score – which we call ‘marginal social welfare’ – for each type of family in your local area, identified using data from the National Pupil Database. This is done by calculating the likely impact on GLD at scale if the intervention were offered to each type of family in the local population, and then giving more weight to interventions that match users’ preferences so that they are more likely to appear in the suggested portfolio.

This calculation is the result of a mathematical formula that is applied in the same way to each intervention and group of people. To spell out the logic, we can split it into its three main components. These are three of the ‘tags’ you see reported under each intervention in your portfolio.

  • Predicted impact on GLD

    Nesta developed a statistical model to predict impact on GLD when direct impact on GLD has not been measured in formal evaluation. It takes evidence from evaluations, like randomised controlled trials (RCTs), and translates it into the most likely impact on the Good Level of Development measure by drawing on known relationships between the outcomes that have been measured in evaluations and GLD. Note: all modelling is based on predicted impact when the intervention is implemented with ‘high fidelity’ (to a high standard).

  • The quality of evidence

    Not all evidence is created equal. Some evaluation methods offer a much higher quality of evidence than others. Taking an evaluation’s estimated effect at face value would assume the effect is going to be replicated in the real world. This is very unlikely to be the case, because evaluations often occur under favourable conditions and because some methods of evaluation (other than RCTs) can provide misleading results.

    For this reason, the quality of evidence underpinning the intervention’s evaluation – based on the Foundations Guidebook rating – is also a critical input to PASCAL. The Guidebook rating lets us build an estimate of the likelihood that an evaluation’s findings will replicate in the real world.

    We use academic research (‘meta-analyses’) to penalise less robust evaluations by estimating the likelihood that their findings will fail to replicate in the real world. For example, we estimate that an intervention that has only had one RCT has about a 60% chance of replicating its effects in another RCT (Camerer et al., 2018), while an intervention that only has quasi-experimental evidence is estimated to be 60% as likely to replicate as one backed by an RCT (Salcher-Konrad et al., 2024).

  • The appeal to parents

    Finally, you can’t achieve impact without parents signing up to the programmes you implement, so the appeal of each programme to parents is critical. PASCAL accounts for this by multiplying each intervention’s score by the share of parents in each demographic group who said they would sign up for it, taken from a rigorous survey experiment we ran with over 2,000 parents.

Each of these three elements – impact, real-world replication, and predicted sign-up – is critical to generating impact. Fortunately, they can be combined using a neat mathematical formula that tells us the statistically expected impact on GLD among different sets of families for each intervention. This is the ‘marginal social welfare’ score for each intervention.

Finally, PASCAL adjusts the ‘marginal social welfare’ score to reflect the priorities selected by the user. Interventions matching your priorities end up with a higher score after this adjustment, meaning they appear more often in your portfolio. PASCAL does this because we believe it’s important to account for your local knowledge when it makes its suggestions.
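
As an illustration of how these pieces could fit together, here is a minimal Python sketch. It is not our exact formula: the multiplicative structure simply follows the description above, and every input number is hypothetical.

    def marginal_social_welfare(predicted_gld_impact: float,
                                replication_probability: float,
                                signup_share: float,
                                priority_weight: float = 1.0) -> float:
        """Expected impact on GLD for one intervention and one family type."""
        return (predicted_gld_impact       # impact if delivered with high fidelity
                * replication_probability  # evidence-quality penalty
                * signup_share             # share of parents predicted to sign up
                * priority_weight)         # uplift for matching user priorities

    # Evidence-quality penalties using the figures quoted above: one RCT has
    # roughly a 60% chance of replicating (Camerer et al., 2018), and
    # quasi-experimental evidence is roughly 60% as likely to replicate as an
    # RCT (Salcher-Konrad et al., 2024), i.e. about 0.6 * 0.6 = 0.36.
    P_REPLICATE = {"single_rct": 0.60, "quasi_experimental": 0.60 * 0.60}

    score = marginal_social_welfare(
        predicted_gld_impact=0.05,   # hypothetical: +5 percentage points on GLD
        replication_probability=P_REPLICATE["single_rct"],
        signup_share=0.30,           # hypothetical predicted sign-up share
        priority_weight=1.2,         # hypothetical uplift from user priorities
    )
    print(f"marginal social welfare: {score:.4f}")  # 0.05 * 0.6 * 0.3 * 1.2 = 0.0108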

How we modelled the impact of each programme on the Good Level of Development measure

A major challenge in the market for parenting support interventions is that their impacts are not measured on a common outcome, making it very hard to know which programme is likely to deliver the biggest impact on children’s outcomes at scale. Instead, programmes that have received robust evaluations are tested on a range of different measures, normally questionnaires on socioemotional difficulties or cognitive assessments.

To enable PASCAL to compare interventions’ impact on the same scale, we built our own statistical model, a variation of a ‘structural equation model’. This model enables us to predict the impact of each parenting intervention on the ‘Good Level of Development’ (GLD) measure in England. To do so, we used a large, rich dataset (the Millennium Cohort Study) which tracked thousands of children and measured both their GLD scores and other outcomes used in many evaluations of parenting interventions, like the Strengths and Difficulties Questionnaire (SDQ) and the British Ability Scales II (BAS II).

We set up our model to make the best use of this data, leveraging our team’s expertise in child development. It extracts the "common threads" – the underlying statistical relationships – between all these different measures. These common threads are what parenting interventions are judged to act upon.

By extracting these common threads and their statistical relationships, the model can take the evidence from an evaluation (e.g., "Intervention X improves SDQ scores by 5%") and translate it into the most likely predicted impact on GLD. This allows us to compare all interventions on the same "like-for-like" basis, even when they were originally tested on different outcome measures.
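
To illustrate this "common thread" logic, here is a minimal Python sketch under strong simplifying assumptions: a single latent development factor and made-up factor loadings, not values estimated from the Millennium Cohort Study. Our structural equation model is richer than this, but the translation step rests on the same principle.

    # Hypothetical standardised loadings of each measure on a shared latent
    # child-development factor, as a fitted factor model might estimate them.
    LOADINGS = {"sdq": 0.70, "bas_ii": 0.65, "gld": 0.55}

    def translate_effect(effect: float, measured_on: str, target: str) -> float:
        """Map a standardised effect on one measure onto another via the latent factor."""
        latent_effect = effect / LOADINGS[measured_on]  # back out the effect on the factor
        return latent_effect * LOADINGS[target]         # project it onto the target measure

    # e.g. an evaluation reports a standardised effect of 0.20 on the SDQ:
    predicted_gld = translate_effect(0.20, measured_on="sdq", target="gld")
    print(f"predicted standardised effect on GLD: {predicted_gld:.3f}")  # ~0.157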

Our survey of over 2,000 parents to learn about their preferences

When developing PASCAL, we recognised that a programme’s "impact" on paper doesn't matter if parents don't actually sign up for it. This means the appeal to parents is just as critical as the evidence or the impact score. However, no existing dataset was available to consistently compare the popularity of dozens of different parenting programmes.

For this reason, we designed and ran a large-scale survey experiment with over 2,000 parents across the UK. Instead of just asking what kinds of interventions parents liked the most, we used a state-of-the-art method called a "discrete choice experiment."

We showed each parent a small, random set of interventions (5 at a time) and provided an accessible summary of their key features, for example:

  • Is it online or in-person?
  • Is it in a group or one-on-one?
  • How many sessions does it have?
  • Can you bring your child to the session?

We then asked them to choose the one they would be most likely to sign up for and attend. By repeating this thousands of times with different parents and different combinations of programmes, our model can estimate the relative popularity of every single intervention. More importantly, it can say why parents prefer certain programmes, showing that convenience (like being online) and childcare provision matter to parents’ preferences. The "parent appeal" score in PASCAL comes directly from this data, ensuring its portfolio suggestions reflect parents’ voice.
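
For readers curious about the mechanics, here is a minimal Python sketch of the kind of model typically fitted to discrete choice experiment data: a conditional logit, where a programme’s appeal is the sum of utilities attached to its features and the probability a parent picks it follows a softmax over the options shown. The attribute weights below are hypothetical, not our survey estimates.

    import math

    # Hypothetical utility weights for the programme features shown to parents.
    UTILITY = {
        "online": 0.40,         # convenience of online delivery
        "one_on_one": 0.25,     # one-on-one rather than group delivery
        "per_session": -0.05,   # each additional session reduces appeal
        "child_welcome": 0.50,  # parents can bring their child along
    }

    def utility(p: dict) -> float:
        """Total utility of one programme, given its features."""
        return (UTILITY["online"] * p["online"]
                + UTILITY["one_on_one"] * p["one_on_one"]
                + UTILITY["per_session"] * p["sessions"]
                + UTILITY["child_welcome"] * p["child_welcome"])

    def choice_probabilities(choice_set: list) -> list:
        """Conditional logit: softmax of utilities over the programmes shown."""
        exp_u = [math.exp(utility(p)) for p in choice_set]
        total = sum(exp_u)
        return [e / total for e in exp_u]

    shown = [
        {"online": 1, "one_on_one": 0, "sessions": 6, "child_welcome": 1},
        {"online": 0, "one_on_one": 1, "sessions": 10, "child_welcome": 0},
    ]
    for p, prob in zip(shown, choice_probabilities(shown)):
        print(p, f"-> predicted choice share: {prob:.2f}")

In the real analysis these weights are estimated from the thousands of observed choices, and the fitted model can then predict a sign-up share for any programme and demographic group in the tool.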

Because our large sample and experimental design give us such a rich dataset, we can also reliably predict how the popularity of different programmes varies across demographic groups. For example, we can say that a targeted programme supporting socioemotional development is likely to be more popular among parents with concerns about their child’s socioemotional development.

Limitations

In order to let PASCAL use a consistent quantitative approach to suggest portfolios, we’ve made some simplifying assumptions. These assumptions reflect our best judgement, drawing on our experience working in this sector and the best evidence available to us, but we want to be transparent about them so you can balance PASCAL’s suggestions with other information to help you reach your decisions.

  • PASCAL doesn’t know what can be procured or implemented in your local context

    For this reason, we suggest using PASCAL to gather information on the possible value for money of relevant parenting interventions in the early planning stages of commissioning. However, you should always couple its suggestions with your own research on what interventions are most appropriate and can be feasibly delivered in your local context.

  • PASCAL uses its best estimates, but there is uncertainty around them

    For every input to PASCAL’s calculations, we use the best estimate available. However, because there is some uncertainty around the inputs to PASCAL, as well as the modelling it uses to produce its suggested portfolios, there is also uncertainty around what it decides is the optimal combination of programmes. For this reason, we suggest also looking carefully at the alternative programmes PASCAL suggests.

  • Programme costs in PASCAL are only approximate

    We built PASCAL to suggest portfolios of interventions that are feasible to implement within a particular budget, given approximate data on programme costs per family. However, there are significant uncertainties in the data we used for this purpose. For example, in many cases we had to draw on Foundations’ estimates of unit costs, which are based on modelling of the following factors:

    • Training fees and time needed to train practitioners
    • Requirements for follow-up or booster training
    • Costs of initial and ongoing intervention materials
    • Practitioner hours required for delivery
    • Qualification levels of practitioners and supervisors
    • Internal and/or external supervision needs
    • Licensing fees
    • Typical group size for delivery

    However, it’s likely that these cost figures miss some important inputs and may not account for economies of scale as interventions are delivered to more families. We have checked our information on costs with programme developers where possible, and PASCAL uses the best cost information available to us.

  • PASCAL doesn’t account for constraints on your facilitator workforce

    We assume that the upper limit on the delivery of a given programme is the number of families who may be willing to sign up. We are aware that this number can be substantially higher than the number of families currently reached by parenting interventions in many local authorities, and may be beyond the capacity of your current workforce. We chose this approach because PASCAL is intended for use with BSiL funding, part of which can be used to expand the workforce available to deliver parenting interventions.

  • Some of PASCAL’s analysis uses old data and some additional assumptions

    Our model of the impact on GLD uses data from the Millennium Cohort Study (MCS), a large and rich longitudinal dataset. The main weakness of the MCS is that the relevant data were collected in 2005, meaning that they use an old definition of the Early Years Foundation Stage Profile.

    We take several steps to ‘harmonise’ the old data with the newest definition of the Good Level of Development, in particular so that the share of children meeting the GLD matches that in the latest available data (a simple version of this calibration is sketched at the end of this section). Additionally, we have evidence that the key relationships in the data have not changed much over time.

    We use National Pupil Database data to calculate the shares of children with developmental needs in each local authority. It is worth noting that we measure developmental needs at age 5, through the shares of children meeting Early Learning Goals, and use them as approximations to the shares of children with these developmental needs at age 3. This approach is justified by the fact that developmental needs at age 3 are highly predictive of needs at age 5.

We have taken this approach because this is the most robust data available on which to base our analysis.
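
As a small illustration of the harmonisation step mentioned above, here is one simple approach in Python: calibrate a cut-off on the older measure so that the share of children classified as meeting the GLD matches the share in the latest published data. This is a sketch of the general technique, not our exact procedure, and the numbers are hypothetical.

    import random

    random.seed(0)
    # Stand-in for children's scores on the older (2005-era) measure.
    old_scores = [random.gauss(0, 1) for _ in range(10_000)]

    target_share = 0.67  # hypothetical share of children meeting the GLD today

    # Set the cut-off at the (1 - target_share) quantile, so that the target
    # share of children falls at or above it.
    cutoff = sorted(old_scores)[int((1 - target_share) * len(old_scores))]
    share = sum(s >= cutoff for s in old_scores) / len(old_scores)
    print(f"calibrated cut-off: {cutoff:.3f}; share meeting GLD: {share:.3f}")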