Critical Assessment of An Economic Evaluation of a Healthy Lifestyle Intervention for Chronic Low Back Pain (LBP)

Background: Globally, chronic low back pain (LBP) contributes significantly to the overall burden of disease, placing a heavy load on society through absenteeism and associated healthcare costs. Finding cost-effective measures to treat and prevent low back pain is therefore of utmost importance. Methods: A critical assessment of the study by Williams et al (2018) was performed using an adapted version of Drummond’s checklist for the critical appraisal of economic evaluations. Results: The authors performed appropriate statistical analyses using the available data. Means and proportions of baseline characteristics of the intervention group were compared to those of the control group to evaluate their comparability. Conclusion: After a thorough assessment of the economic evaluation methods used by Williams et al, we conclude that their results are valid and trustworthy to a degree. The study soundly meets many of the requirements listed by Drummond et al, yet it fails to take into account a few aspects, which introduces some weaknesses.


Introduction
The amount of health-related research has grown notably in the past few decades. Methodological differences, among other reasons, have led healthcare providers and other decision makers to encounter difficulties when choosing the best alternative to accomplish their goals. Since favoring a particular option also carries an opportunity cost [1], decisions should be made in an informed manner and not be left to chance. Economic evaluations, defined as the "comparative analyses of alternative courses of action in terms of both their costs and consequences," [2] serve this exact purpose. Although economic evaluations may face certain challenges (e.g. technical, ethical), they remain a useful framework for identifying the best way of allocating scarce resources. In addition, a well-conducted economic evaluation can bring a higher degree of transparency and accountability to the decision-making process by allowing the judgements underlying those decisions to be examined. However, since the quality of an economic evaluation depends on the choices made by those performing it, it is important to critically assess its validity.
Globally, chronic low back pain (LBP) contributes significantly to the overall burden of disease [3], placing a heavy load on society through absenteeism and associated healthcare costs [4]. Finding cost-effective measures to treat and prevent low back pain is therefore of utmost importance.
Williams et al performed an economic evaluation of a randomized controlled trial focusing on a healthy lifestyle intervention in overweight and obese individuals with chronic low back pain [4]. The intervention consisted of a brief advice telephone call, a one-hour consultation with a physiotherapist, and a referral to a 6-month health coaching service provided over the phone. The researchers investigated whether the lifestyle intervention was more cost-effective than usual care and found that it could be cost-effective for quality-adjusted life years (QALYs) from the societal perspective.

Aim
The aim of the current paper is to evaluate the appropriateness of the methods used in the study by Williams et al, and the validity of their results, by identifying the study's strengths and weaknesses. This will help determine the usefulness of the study for making decisions or planning further analysis.

Methods
A critical assessment of the study by Williams et al (2018) [4] was performed using an adapted version of Drummond's checklist for the critical appraisal of economic evaluations [2]. All 33 items on the checklist were reviewed and their relevance was discussed. Items deemed irrelevant to the assessment of the study were excluded. The final list was compiled collaboratively and provided the framework for the critical review of the study.

Results
The authors presented a clear research question. They stated that the purpose of their study was "to undertake an economic evaluation of [a] healthy lifestyle intervention, compared with usual care". In addition, they provided a description of how the randomized controlled trial was carried out, outlining the processes of recruitment, assignment to treatment, and intervention. In brief, patients who satisfied the criteria were randomly assigned to either the treatment group or the usual care (control) group. The latter could be considered the do-nothing alternative. As previously mentioned, the intervention consisted of telephone advice, a consultation with a physiotherapist, and a referral to a 6-month healthy lifestyle coaching service provided over the phone.
With regard to the costs and consequences, the authors provided an acceptable amount of detail for each alternative. They dealt with three major categories of costs: intervention, healthcare utilization, and absenteeism. It should also be mentioned that "[all] costs were converted to Australian dollars 2016 using consumer price indices" [4]. The cost of the intervention was micro-costed and made up of three elements: (1) the cost of providing the advice over the phone; (2) the cost of a one-hour physiotherapy session; and (3) the cost for a specialist to conduct a telephone-based healthy lifestyle coaching session, multiplied by the number of calls each participant received. However, the authors did not clarify how they estimated the development and operational costs of these calls. Healthcare utilization costs included the costs of medical services or medication(s) used by the participants to manage their low back pain. At two time points (6 and 26 weeks follow-up), the participants had to recall the services and medication(s) they had used during the preceding 6 weeks. Assuming linearity, the average cost of the two time points was used to estimate the healthcare cost over the entire duration of the study. Lastly, absenteeism was calculated from the number of days the participants recalled not going to work due to their low back pain. The cost of absenteeism was also calculated through extrapolation. The authors did not explicitly identify capital costs. Furthermore, discounting was not performed because the follow-up period of the trial was less than a year.
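As an illustration only, the linear extrapolation of recalled costs described above can be sketched as follows. The scaling factor (study length divided by recall window), the function name, and the parameter names are assumptions of this sketch and are not details reported by Williams et al; the example figures are invented.

```python
def extrapolate_cost(cost_recalled_at_6wk, cost_recalled_at_26wk,
                     recall_weeks=6.0, study_weeks=26.0):
    """Estimate the cost over the whole study from two 6-week recall windows.

    Assumption of this sketch: costs accrue linearly, so the mean cost of
    the two recall windows is scaled by (study length / window length).
    """
    if recall_weeks <= 0 or study_weeks <= 0:
        raise ValueError("durations must be positive")
    # Average cost per 6-week recall window across the two follow-ups.
    mean_window_cost = (cost_recalled_at_6wk + cost_recalled_at_26wk) / 2.0
    # Scale the per-window cost up to the full follow-up period.
    return mean_window_cost * (study_weeks / recall_weeks)


# Hypothetical participant: 120 and 60 (Australian dollars) in the two windows.
total_cost = extrapolate_cost(120.0, 60.0)
```

The same calculation would apply to absenteeism if recalled days off work were first multiplied by a daily wage rate, which is again an assumption rather than a reported detail.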
Costs were included or excluded in the statistical analysis depending on the perspective from which it was conducted. The primary analysis, conducted from the societal perspective, included all cost categories mentioned above while, in the secondary analysis conducted from the healthcare perspective, the cost of absenteeism was excluded. The authors did not perform an analysis from the patient perspective in this study.
In the economic evaluation, the authors divided the consequences, or effects, into primary and secondary outcomes. The primary outcome was QALYs, while the secondary outcomes consisted of pain intensity, disability, weight, and BMI. All outcomes were included in all analyses. These outcomes were measured using self-reported data recorded at baseline and at two subsequent time points. The exception was height, which was only recorded at baseline. Quality of life was assessed using the 12-item Short Form Health Survey. This measurement was converted into a utility score using the British tariff, and this score was multiplied by time to give QALYs. Back pain intensity and disability were also measured using validated instruments: the Numerical Rating Scale and the Roland Morris Disability Questionnaire, respectively.
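To make the step from utility scores to QALYs concrete, a minimal sketch is given below. It assumes linear interpolation between measurement points (the area-under-the-curve approach) and a 52-week year; neither assumption is confirmed by the study report, and the function name and example values are hypothetical.

```python
def qalys_from_utilities(times_in_weeks, utilities, weeks_per_year=52.0):
    """Area under the utility curve, expressed in quality-adjusted life years.

    Assumption of this sketch: utility changes linearly between
    consecutive measurement points (trapezoidal rule).
    """
    if len(times_in_weeks) != len(utilities) or len(utilities) < 2:
        raise ValueError("need at least two matching time/utility pairs")
    total = 0.0
    for i in range(1, len(utilities)):
        # Length of the interval between two measurements, in years.
        interval_years = (times_in_weeks[i] - times_in_weeks[i - 1]) / weeks_per_year
        # Mean utility over the interval under linear interpolation.
        mean_utility = (utilities[i] + utilities[i - 1]) / 2.0
        total += interval_years * mean_utility
    return total


# Hypothetical participant measured at baseline, 6 weeks and 26 weeks.
example_qalys = qalys_from_utilities([0, 6, 26], [0.6, 0.7, 0.8])
```

Under these assumptions, a participant in full health (utility 1.0) for the whole 26-week follow-up would accrue 0.5 QALYs, which is the upper bound for a trial of this length.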
The authors performed appropriate statistical analyses using the available data. Means and proportions of baseline characteristics of the intervention group were compared to those of the control group to evaluate their comparability. Missing data on costs and consequences were handled through multiple imputation by chained equations. Ten complete datasets were created to ensure that the loss of efficiency was below 5%. Each dataset was analyzed separately and the pooled estimates were then calculated using Rubin's rules, taking into account both the uncertainty within a dataset and the uncertainty due to missing data. In addition, seemingly unrelated regression analyses were performed to estimate the cost and effect differences for all outcomes. Importantly, incremental analyses of the costs and consequences of the alternatives were also performed. The incremental cost-effectiveness ratios (ICERs) were calculated for all outcomes "by dividing the difference in total costs by the difference in outcomes" [4]. To test the robustness of the economic evaluation, two sensitivity analyses were performed from the societal perspective. It should be noted that the conclusions were sensitive to these analyses, which reflects the uncertainty in the results.
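The two calculations named above, pooling with Rubin's rules and the ICER, can be sketched as follows. This is a generic textbook illustration and not a reconstruction of the authors' analysis code; the function names and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-dataset estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its total variance, which combines
    within-imputation and between-imputation variability.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 datasets")
    pooled = mean(estimates)                # pooled point estimate
    within = mean(variances)                # average within-dataset variance
    between = variance(estimates)           # sample variance across datasets (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


def icer(cost_difference, effect_difference):
    """Incremental cost-effectiveness ratio: cost difference / effect difference."""
    if effect_difference == 0:
        raise ValueError("ICER is undefined when the effect difference is zero")
    return cost_difference / effect_difference


# Hypothetical cost differences (with variances) from three imputed datasets.
pooled_cost_diff, pooled_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
# Hypothetical pooled differences: 200 in costs, 0.04 in QALYs.
example_icer = icer(200.0, 0.04)
```

The between-imputation term is what carries the uncertainty due to missing data: the more the estimates differ across imputed datasets, the larger the total variance of the pooled estimate.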
Areas of discussion included key findings, interpretation of those findings, comparison with the literature, strengths, and weaknesses. The authors interpreted the results of their cost-utility and cost-effectiveness analyses while identifying certain findings that should be viewed with caution. In addition, they discussed the trustworthiness of their research by elaborating on its internal and external validity. They also shed light on potential sources of error and bias that could have compromised the methodological integrity of their study.

Discussion
While the study fulfills some of the requirements set by Drummond et al [2] well, it does not take certain aspects into account, and these omissions contribute to its weaknesses.
The evaluation is based on an RCT, so its data come from one of the most rigorous study designs, owing to design characteristics such as randomisation. In this context, however, the trial lacks several essentials, which limits the validity of the results and in turn affects the outcome measures available for the economic evaluation. One limitation is the sample size, which was 160, with roughly one control participant for each intervention participant. In addition, no information is given on how treatment was assigned or how non-compliance was assessed. Furthermore, the missing information on several participants and the short duration of the trial could limit the generalizability of the results and could indirectly influence the validity of the evaluation. Also, data on height and weight were self-reported, which makes the final results more prone to bias and uncertainty than data collected by professionals with validated instruments.
Regarding the data on costs and outcomes of healthcare utilization and absenteeism, it is unclear how they were obtained from the control group, and whether control participants were assessed at 6 weeks as the intervention group was. Such information should be stated clearly in the final report to ensure the credibility of the results. Moreover, the costs related to healthcare utilization and absenteeism relied largely on patients' ability to recall the number of days they were absent from work and the number of days they used healthcare because of low back pain. This can lead to recall bias, which can substantially affect the internal and external validity of the results. Furthermore, the study ignored the costs of presenteeism (i.e. decreased productivity at work due to low back pain), which could represent a significant cost of chronic low back pain to society.
The measurement of the outcome depends largely on how patients assessed their own level of pain. Data collected in this way can be biased because, for example, some patients tend to exaggerate their symptoms to gain attention.
The evaluation included the societal and healthcare perspectives only, and nothing is mentioned regarding the patient perspective. This perspective could have been useful to evaluate whether patients incurred any substantial out-of-pocket expenses in order to comply with the advice from the lifestyle intervention (e.g. gym membership), which could deter some participants from following the guidelines provided. Furthermore, by leaving out this perspective, other potential health benefits beyond the impact of an improved lifestyle on LBP were not considered.
Regarding the analysis, the authors used an intention-to-treat approach, which analyses all participants according to their assigned group regardless of whether they adhered to the protocol. Because non-adherence is not accounted for, this approach can introduce greater uncertainty and possible bias into the estimate of the cost-effectiveness of the intervention. Also, the sensitivity analyses produced markedly different results.

Conclusion
After a thorough assessment of the economic evaluation methods used by Williams et al, we conclude that their results are valid and trustworthy to a degree. The study soundly meets many of the requirements listed by Drummond et al [2], yet it fails to take into account a few aspects, which introduces some weaknesses.
Issues such as missing data, a relatively small RCT sample size, a lack of information on how treatment was assigned or how non-compliance was assessed, and self-reported height and weight data make the final results of the RCT less reliable. In addition, patients assessed their own pain level as a measure of outcome, which can lead to biased data. All of the above could influence the outcome measures of the economic evaluation.
While the healthcare and societal perspectives were included, the study lacks an analysis from the patient perspective, which would have led to a better understanding of the cost-effectiveness of LBP treatment from the patient's point of view.
When it comes to the cost-effectiveness results, internal and external validity may have been affected by recall bias, because absenteeism and healthcare utilization were assessed from patients' recollection of the days they were absent from work and the days they used healthcare due to LBP, respectively. Also, it is uncertain how data on the costs and outcomes of absenteeism and healthcare utilization were collected from the control group.
In the end, it is the lack of certain data and the uncertainty about the method(s) of data collection that make the study not entirely trustworthy. As suggested by its own authors, caution should be used when interpreting its results and putting them into practice.

Funding
No funding was received for the implementation of this study.