Methods notes on the estimation of Population Attributable Fractions • tobalcepi

Summary

This methodology report sets out a description of how we estimate the fraction of disease cases attributable to tobacco and alcohol. It is currently limited to a brief description and critique of the mathematical approach commonly taken.

1 Introduction

Tobacco and alcohol are risk factors for a broad range of health conditions, but the relationship between smoking, drinking and some health conditions is complex. Whilst some conditions, such as alcohol liver disease, are caused solely by alcohol, there are other conditions for which alcohol is a contributory, but not the only, risk factor.

When the risk of having a health condition is increased by exposure to tobacco or alcohol consumption, but tobacco or alcohol consumption is not the only factor that increases risk, we refer to the burden of ill health from that condition as being “partially-attributable” to tobacco or alcohol consumption.

Estimating the proportions of the health burden of conditions that are partially attributable to tobacco smoking and drinking alcohol is important for monitoring the impact of these commodities on society, e.g. for understanding the additional demand for healthcare services that is due to tobacco or alcohol consumption (NHS Digital 2019a, 2019b).

These proportions, known generally as Population Attributable Fractions (PAFs) or specifically Alcohol Attributable Fractions (AAFs), Smoking Attributable Fractions (SAFs) or Alcohol and Smoking Attributable Fractions (ASAFs) (that define the joint burden of disease that is due to the combination of tobacco and alcohol consumption among individuals in society), are defined as the proportion of the cases of a partially attributable disease or injury that would be prevented if exposure to alcohol or tobacco was eliminated (Mansournia and Altman 2018; Rosen 2013).

Table 1.1: Explanation of terms.
Term	Definition
Population Attributable Fractions (PAFs)	The proportional reduction in population disease or mortality that would occur if exposure to a risk factor were reduced to an alternative ideal exposure scenario
SAF	Smoking attributable fraction
AAF	Alcohol attributable fraction
ASAF	Joint alcohol and smoking attributable fraction
Incidence	The incidence of a disease is the rate at which new cases occur in a population during a specified period.
Prevalence	Prevalence is a measure of the frequency of a disease or health condition in a population at a particular point in time (and is different to incidence, which is a measure of the number of newly diagnosed cases within a particular time period).

Table 1.2: Overview of mathematical notation.
Symbol	Description
\(a\)	age in years (0, 1, 2, 3, …)
\(a_{min}\)	youngest age in the model
\(a_{max}\)	oldest age in the model
\(y\)	period in calendar years (2001, 2002, 2003, …)
\(c\)	birth cohort in calendar years i.e. the years of birth of the individuals in our model, \(cohort = y-a\) (1940, 1941, 1942, …)
\(A\)	population size i.e. the number of individuals who are simulated to be alive in year \(y\)
\(u\)	generic index of discrete exposure states, which might be defined variously in terms of current and past consumption of tobacco and/or alcohol.
\(\theta\)	probability density function (used to describe the distribution of the population among discrete states)
\(rr\)	relative risk of disease
\(h\)	disease index
\(X\)	number of new cases of a disease in a year (incidence)
\(P_{\text{incidence}}\)	probability that an individual develops a disease in a year (incidence)
\(z\)	proportional increase in the relative risk of a disease due to an exposure

2 Population attributable fraction derivation

2.1 Preamble

The population attributable fraction (PAF) is the proportion by which the cases of a disease would be reduced if the population were not exposed to a certain factor. In our case, we define the exposure as either tobacco smoking, alcohol consumption, or a combination of the two. The PAF is defined as

\[\begin{equation} PAF = \frac{A_{\text{exposed}}(h)}{A_{\text{total}}(h)} \tag{2.1} \end{equation}\]

where \(A_{exposed}(h)\) is the number of individuals who have disease \(h\) because of the exposure, and \(A_{total}(h)\) is the total number of people with disease \(h\). It should be obvious that for chronic diseases related to tobacco and alcohol, the number of individuals who have the disease in any year \(y\) is strongly influenced by how their history of exposure to tobacco and alcohol consumption over many years has affected disease incidence, recovery, relapse and survival. However, it turns out that it is difficult to incorporate this complex chain of causality into PAF estimates. This means that when we consider the methodology of PAF calculation, part of our aim is to understand how our chosen method might bias the PAF estimates.

2.2 The commonly used formula

There are a range of alternative methods to calculate PAFs (e.g. see a comparison of methods for estimating smoking attributable fractions (Oza et al. 2011)). The method that we consider, which we term the “prevalence-based method”, is based on the use of cross-sectional survey data on tobacco and alcohol consumption. It does not make a direct link between an individual’s tobacco and alcohol consumption and their development of disease using these data. Instead, it makes use of published statistical estimates that show how the relative risk of disease varies with exposure to tobacco and alcohol consumption. It is commonly used to estimate the disease morbidity and mortality attributable to tobacco and alcohol consumption in England (NHS Digital 2019a, 2019b; Tobacco Advisory Group of the Royal College of Physicians 2018; Jones et al. 2008). It is defined for each disease as

\[\begin{equation} PAF = \frac{\sum_{u}\theta(u)[rr(u)-1]}{1+\sum_{u}\theta(u)[rr(u)-1]},\tag{2.2} \end{equation}\]

where \(u\) is an index of exposure, which in our case might be defined in terms of current and past consumption of tobacco and/or alcohol, and \(rr\) is the relative risk of disease. PAFs might be estimated using (2.2) separately for particular calendar years, ages or various definitions of population strata.

Bias notes

Estimates will depend on the extent to which the survey data records each individual’s current and past tobacco and alcohol consumption, and most cross-sectional survey datasets have limited information on individual consumption histories.
Estimates also depend on the nature of the statistical estimates that link individual tobacco and alcohol consumption to their relative risk of disease. Meta-analyses are used where possible, but primary studies vary in influential factors such as: population context; how they have defined exposure to tobacco and alcohol; how they have defined disease end-points, e.g. incidence, prevalence or death; study design, e.g. cohort or case-control; and the factors controlled for in statistical analysis.

2.3 Derivation for chronic disease incidence

The derivation of (2.2) in terms of the incidence of a chronic disease, such as a particular type of cancer, is as follows:

Considering a particular year and age, write the number of new cases of the disease, \(X(h)\), in terms of: population size, \(A\); the proportion of people with each level of exposure, \(\theta(u)\); the relative increase in the risk of disease incidence at that level of exposure, \(rr(u, h)\); and the probability that people who are not exposed get the disease, \(P_{\text{incidence}}(h | u = \text{not exposed})\).

\[\begin{equation} X(h) = A\times{}\sum_{u}P_{\text{incidence}}(h | u = \text{not exposed})\times{}\theta(u)\times{}rr(u, h),\tag{2.3} \end{equation}\]

noting that

\[\begin{equation} rr(u, h) = 1 + z,\tag{2.4} \end{equation}\]

where \(z\) is the proportional increase in risk due to the exposure such that \(z = 0\) for individuals who are not exposed.

Re-write (2.3) in terms of \(z\) to give \[\begin{equation} X(h) = A\times{}\sum_{u}P_{\text{incidence}}(h | u = \text{not exposed})\times{}\theta(u)\times{}[1+z(u, h)].\label{der2} \end{equation}\] For the numerator, expand the brackets and retain only the terms that give the number of new disease cases that are due to exposure \[\begin{align} X_{\text{exposed}}(h) = A\times{}\sum_{u}\theta(u)\times{}P_{\text{incidence}}(h | u = \text{not exposed})\times{}z(u, h). \end{align}\] For the denominator, use the full expanded version, i.e. the total new disease cases, and apply (2.1) to give \[\begin{equation} PAF = \frac{A\times{}\sum_{u}\theta(u)\times{}P_{\text{incidence}}(h | u = \text{not exposed})\times{}z(u, h)}{A\times{}\sum_{u}\left[\theta(u)\times{}P_{\text{incidence}}(h | u = \text{not exposed})+\theta(u)\times{}P_{\text{incidence}}(h | u = \text{not exposed})\times{}z(u, h)\right]}, \end{equation}\] which simplifies to \[\begin{equation} PAF = \frac{\sum_{u}\theta(u)\times{}z(u,h)}{1+\sum_{u}\theta(u)\times{}z(u,h)}, \end{equation}\] and then substituting with (2.4) becomes (2.2).

Bias notes

The PAFs associated with the prevalence and mortality rates from chronic diseases for a particular age will be a function of the PAFs associated with disease incidence for previous ages within the same birth-cohort.
The function will be a weighted average, where the weights indicate the distribution of ages of disease incidence for the individuals who are currently alive and have the disease.
If (2.2) is applied to the prevalence and mortality rates from chronic diseases, then this makes the strong assumption that exposure is constant across ages within a birth-cohort.

2.4 Protective effects

For some diseases, the relative risks indicate that some levels of exposure to tobacco or alcohol consumption protect against the disease, i.e. people who smoke or drink are less likely to get the disease. Protective effects are defined by \(rr(u,h) < 1\), which when (2.2) is applied results in negative PAFs. Negative PAFs can be understood by defining the relationship between the observed lower number of disease cases with the exposure \(A_\text{observed exposure}(h)\) and the hypothetically higher number of disease cases without the exposure \(A_\text{no exposure}(h)\) as:

\[\begin{equation} A_\text{no exposure}(h) \times{} [1+PAF] = A_\text{observed exposure}(h). \end{equation}\]

3 Single substance and joint substance attributable fractions

The population attributable fraction (PAF) is the proportion by which the cases of a disease would be reduced if the population were not exposed to a certain risk factor. In the STAPM modelling, the exposure considered is either only alcohol consumption, only tobacco smoking or a combination of the two. When both alcohol consumption and tobacco smoking are considered, the population attributable fraction represents the proportion by which the cases of a disease would be reduced if the exposure of the population to the effects of both tobacco and alcohol consumption were removed. If this were to happen, then it would mean that people who both smoke and drink would have to stop both behaviours, and people who smoke but do not drink, and vice versa, would have to stop one behaviour.

It is commonly observed that the joint alcohol and smoking attributable fractions (ASAFs) are smaller than the sum of the single substance attributable fractions, the AAFs and SAFs. In general terms, this is because the substance specific attributable fractions combine to give the joint attributable fraction according to the formula (Mant and Hicks 1995; Ezzati et al. 2003):

\[\begin{equation} ASAF = 1 – (1 - AAF) \times (1 – SAF) \end{equation}\]

According to this formula, if 20% of cases of a disease could be prevented by removing exposure to alcohol, and 20% of cases of the same disease could be prevented by removing exposure to tobacco smoke, then we would expect that 36% rather than 40% of cases of the disease would be prevented by removing exposure to both substances.

The reduction of the ASAF in relation to the sum of the single substance attributable fractions is larger when a larger proportion of the population both drink and smoke, i.e., when there is a larger positive correlation between individuals’ tobacco and alcohol consumption. The reduction of the ASAF in relation to the sum of the single substance attributable fractions implies that preventing some cases of a disease that is related to both tobacco and alcohol in people who consume both substances might necessitate the removal of the exposure to both tobacco and alcohol consumption, i.e. removing just one exposure is insufficient. However, the single substance AAFs and SAFs assume that removing exposure to just one substance is enough, without considering the competing effects of multiple exposures. The ASAF is therefore a more accurate estimate of the joint burden of disease (Ezzati et al. 2002).

In light of this, adjusted versions of the single substance attributable fractions can be calculated in which they are scaled downwards to approximate the cases of a disease related to both tobacco and alcohol that might be prevented if exposure to only one substance were removed.

Difference between the sum of the single substance attributable fractions and the joint attributable fraction:

\[\begin{equation} d = ((AAF + SAF) – ASAF). \end{equation}\]

Adjusted single substance attributable fractions from which half of the above difference has been subtracted:

\[\begin{equation} AAF_{adjusted} = AAF – 0.5 \times d, \end{equation}\]

\[\begin{equation} SAF_{adjusted} = SAF – 0.5 \times d. \end{equation}\]

Using the adjusted single substance attributable fractions means that

\[\begin{equation} ASAF = AAF_{adjusted} + SAF_{adjusted}. \end{equation}\]

Using the adjusted single substance attributable fractions can help to communicate the accumulated burden of disease from exposure to multiple risk factors.

References

Ezzati, Majid, Alan D Lopez, Anthony Rodgers, Stephen Vander Hoorn, and Christopher JL Murray. 2002. “Selected Major Risk Factors and Global and Regional Burden of Disease.” The Lancet 360 (9343): 1347–60.

Ezzati, Majid, Stephen Vander Hoorn, Anthony Rodgers, Alan D Lopez, Colin D Mathers, and Christopher JL Murray. 2003. “Estimates of Global and Regional Potential Health Gains from Reducing Multiple Major Risk Factors.” The Lancet 362 (9380): 271–80.

Jones, Lisa, Mark A Bellis, Dan Dedman, Harry Sumnall, and Karen Tocque. 2008. “Alcohol-Attributable Fractions for England: Alcohol-Attributable Mortality and Hospital Admissions.” Liverpool: Centre for Public Health, Liverpool John Moores University. https://fingertips.phe.org.uk/documents/alcoholattributablefractions.pdf.

Mansournia, Mohammad Ali, and Douglas G Altman. 2018. “Population Attributable Fraction.” BMJ 360. https://doi.org/10.1136/bmj.k757.

Mant, Jonathan, and Nicholas Hicks. 1995. “Detecting Differences in Quality of Care: The Sensitivity of Measures of Process and Outcome in Treating Acute Myocardial Infarction.” Bmj 311 (7008): 793–96.

NHS Digital. 2019a. “Statistics on Alcohol: England.” https://digital.nhs.uk/data-and-information/publications/statistical/statistics-on-alcohol.

———. 2019b. “Statistics on Smoking: England.” https://digital.nhs.uk/data-and-information/publications/statistical/statistics-on-smoking/statistics-on-smoking-england-2018/content.

Oza, Shefali, Michael J Thun, S Jane Henley, Alan D Lopez, and Majid Ezzati. 2011. “How Many Deaths Are Attributable to Smoking in the United States? Comparison of Methods for Estimating Smoking-Attributable Mortality When Smoking Prevalence Changes.” Preventive Medicine 52 (6): 428–33. https://doi.org/10.1016/j.ypmed.2011.04.007.

Rosen, Laura. 2013. “An Intuitive Approach to Understanding the Attributable Fraction of Disease Due to a Risk Factor: The Case of Smoking.” International Journal of Environmental Research and Public Health 10 (7): 2932–43. https://www.mdpi.com/1660-4601/10/7/2932/pdf.

Tobacco Advisory Group of the Royal College of Physicians. 2018. “Hiding in plain sight: Treating tobacco dependency in the NHS.” https://www.rcplondon.ac.uk/projects/outputs/hiding-plain-sight-treating-tobacco-dependency-nhs.