Academia may appear civilised to outsiders, but the rivalries and feuds seen in other walks of life can be just as prevalent in universities. Even so, it is unusual for them to spill out into papers published in academic journals.
That, however, is what is about to happen in the subject of transport project management, with two Australian-based academics having had a paper accepted for publication that reads like a demolition job on the research of Bent Flyvbjerg, the BT professor and chair of major programme management at the University of Oxford, and academic director of the university’s MSc in major programme management.
Flyvbjerg has become internationally renowned for his work on why major infrastructure projects often end up costing more and taking longer to deliver than originally promised. He says the answers lie in behavioural psychology, with project promoters either being over-optimistic or deliberately underestimating costs in order to win funding.
Flyvbjerg’s landmark 2002 paper, ‘Underestimating costs in public works projects: Error or lie?’, co-authored with Holm and Buhl, was published by the Journal of the American Planning Association. Since then, his explanations have won widespread acceptance, including from the UK Government, which has reformed project management through mechanisms such as optimism bias adjustments and passing more of the risk for cost overruns on local transport projects to the promoting local authorities.
But his work has failed to persuade Peter Love, professor of infrastructure and engineering management at the School of Civil and Mechanical Engineering, Curtin University, and Dominic Ahiaga-Dagbui, a lecturer in construction management at Deakin University. They have authored an excoriating critique of Flyvbjerg’s work, which has been accepted for publication in the academic journal Transportation Research Part A: Policy and Practice. The paper’s title is ‘Debunking fake news in a post-truth era: the plausible untruths of cost underestimation in transport infrastructure projects’.
Setting out their case, Love and Ahiaga-Dagbui say there are two schools of thought for cost overrun causation: “Evolutionists suggest that overruns are the result of changes in scope and definition between the inception stage and eventual project completion. Psycho-strategists attribute overruns to deception, planning fallacy and unjustifiable optimism in the setting of initial cost targets.
“Unfortunately, it would appear policy-makers and the media have accepted, at face value, the delusion and deception explanations of cost underestimation causation, despite the lack of empirical evidence to support these conclusions.”
Love and Ahiaga-Dagbui say that, if policy-makers are to ensure their projects are delivered cost-effectively and continually improve in performance, “then it is necessary they stop listening to the rhetorical spin that has been frequently promulgated by Flyvbjerg et al (2002) and instead rely on facts that can be used to make informed decisions about capital cost estimates and potential risks”.
They say that for decades cost overruns were considered a project management issue; that is, they could be addressed through better methodologies for cost estimation and project execution. But Flyvbjerg et al (2002) “indicated that the problem lay within the institutional domains and therefore required a focus on governance, specifically how projects are initiated, their selection and the sharing of accountability between the actors involved in the planning of transport infrastructure projects”.
“A detailed examination of the Flyvbjerg, Holm and Buhl research raises serious questions regarding the methodology adopted, the analysis undertaken, and unfounded conclusions reached. Needless to say, the authors should probably be congratulated, as they have fooled many people with their creative and rather convincing narratives that sensationalise the causes of cost overrun in transportation projects.
“A detailed critique at the time would have revealed the findings were akin to fake news. For example, the sample of projects is statistically unrepresentative. The dataset is unreliable and the reference point from which cost underestimation is determined is ambiguous, resulting in inaccurate and exaggerated cost overrun figures being reported.
“The primary myth, however, is that: ‘Underestimation cannot be explained by error and is best explained by strategic misrepresentation, that is, lying.’
“No evidence at all supports the causal claims of delusion and deception as the main explanations for cost underestimation in transport infrastructure projects.”
Love and Ahiaga-Dagbui accept that cost overruns are a common problem for transport projects and accept too that, in some cases, initial project cost targets can be influenced by factors such as deception and unjustifiable optimism. “Similarly, from a political stance, politicians often announce the projected cost of infrastructure projects well in advance of detailed engineering drawings and costing, usually to fulfil pre-election commitments or to attract new voters.”
But they say the claim that cost underestimation is either down to error or lying “presents the reader with a false dichotomy”. “[It is] an either/or choice that is practically invalid when juxtaposed with the real-world nature of procuring large infrastructure assets. This false dichotomy forces the reader to reject complexity in complex decisions and focus on only the two extremes presented, with the assumption that no middle options are available. When Flyvbjerg et al. (2002) posit the error or lie false dichotomy, they fall foul of the ‘Fallacy of the Excluded Middle’ as there are many other explanations of cost underestimation.
“A lie is a false statement that is deliberately created by someone to intentionally deceive others; deception requires justification. There needs to be a motivation to enact the lie. But, the grounds for producing deceitful cost estimates are not empirically examined in Flyvbjerg et al (2002;2003a;2004;2005;2009).”
A big chunk of their paper focuses on Flyvbjerg’s 2002 paper, which used a dataset of 258 transport infrastructure projects – 167 roads, 58 railways, and 33 bridges and tunnels. The projects were built between 1910 and 1998 in 20 countries across five continents.
“The approach adopted by Flyvbjerg et al (2002) signals methodological alarm bells with regard to: 1. The quality of the data, particularly relating to its accuracy and the rigor used in its collection; 2. Issues of validity and reliability, as well as the format of the data.
“Flyvbjerg et al (2002) states ‘even if the project planning process varies with project type, country, and time, it is typically possible to locate for a given project a specific point in the process that can be identified as the time of decision to build the project’.
“This is a misconception as no international standards exist to determine the level of detail needed to formulate an estimate at the time the decision to build is made. It will naturally, therefore, vary depending on governments’ decision-making processes.
“What is even more apparent is that undertaking any form of comparative study on the accuracy of estimated costs with this dataset would be nonsensical to those who are experienced in this area of research as the: 1. use of different technologies (e.g. construction methods, plant and equipment), standards and requirements exist at various points in time and between countries; 2. the forms of project delivery strategy would vary, particularly the funding mechanisms used to finance projects; 3. legal systems are inherently different between countries and naturally these would have evolved and become more mature over the period; and 4. environmental regulations, requirements and restrictions placed on projects in 1928, for example, are different from those in 1950 and in 1995 for all countries.
“In consideration of the above, the authors leave it to the reader to decide as to the credibility of the data.”
Love and Ahiaga-Dagbui say that “in a flagrant strategy to garner attention”, Flyvbjerg et al. (2002) commenced their paper with the statement: ‘This article presents results from the first statistically significant study of cost escalation.’
They say that, during the period of Flyvbjerg’s analysis, thousands, if not hundreds of thousands, of transport infrastructure projects were constructed, ranging in size and nature. “To obtain a statistically significant sample, Flyvbjerg et al. (2002) would have needed a considerably larger set than the mere 258 projects that they have relied upon.
“A never-ending factoid that has emerged from the original Flyvbjerg et al. (2002) study and resonates throughout the literature is that nine out of ten transport projects worldwide experience cost overruns. Despite the unrepresentative nature of the sample, many academics have peddled, and continue to peddle, this canard.
“It is not only an exaggeration to claim that almost all transport projects (i.e. nine out of ten) are delivered over budget, but misleading. For example, Terrill and Danks’ 2016 analysis of a much larger dataset of 836 transport projects valued at AU$20 million or more, planned or completed since 2001 in Australia, revealed that ‘the majority of projects come in close to their announced costs’. In fact, 66 per cent were either delivered on budget or under the budget. Only 34 per cent overrun their budget.”
They end by questioning Flyvbjerg’s recommendation of reference class forecasting (RCF) – comparing the cost of the project at hand with similar projects – to improve project costings. “Unfortunately, RCF has been adopted by governments in several countries based on the recommendations by Flyvbjerg and COWI (2004) as they have sold it as being best practice.
“To simply assume that a given project is comparable to past and completed projects and that a lump sum up-lift could be added to account for all uncertainties is a gross oversimplification of reality.”
Critics don’t understand behavioural science, says Flyvbjerg
Bent Flyvbjerg said this week he would be preparing a full rebuttal of Love and Ahiaga-Dagbui’s paper for the academic journal in which it will be published. In the meantime, he provided LTT with this response:
“As long as cost estimation for large transport infrastructure projects is understood in the manner described in the paper by Love and Ahiaga-Dagbui, planners and managers will keep getting costs wrong.
Incredibly, the authors ignore 30-40 years of results in behavioural science, including the Nobel Prize-winning work of Daniel Kahneman and Amos Tversky on heuristics and biases. There is not one reference or any mention of this work in the paper by Love and Ahiaga-Dagbui.
To illustrate the consequences, Love and Ahiaga-Dagbui emphasise ‘complexity in complex decisions’ and ‘changes in scope’ as the main causes of cost overrun.
Behavioural scientists would agree that complexity and scope changes are relevant to understanding what goes on in capital projects, but would not see them as root causes of cost overrun. The root cause of cost overrun, according to behavioural science, is the well-documented fact that planners and managers keep underestimating complexity and scope changes in project after project.
‘Your biggest risk is you,’ says behavioural science. It is not complexity and scope changes in themselves that are the main problem; it is how human beings misconceive of these phenomena.
This is a profound and proven insight that behavioural science brings to capital investment planning. Until you understand it, you’re unlikely to get such investments right, including cost estimates.
Love and Ahiaga-Dagbui clearly do not understand. Instead they postulate that this insight is fake news, which is an ironic piece of fake news in itself.
As a thought experiment, assume Love and Ahiaga-Dagbui were right with their postulate about fake news. This would mean that two dozen editors and referees for some of the leading planning and management journals in the world – including Journal of the American Planning Association, International Journal of Project Management, Oxford Review of Economic Policy, European Planning Studies, California Management Review, Environment and Planning, Transport Reviews, and Cities – would all have either overlooked that the results were fake news or would knowingly have published fake news. But written reports and feedback from the editors and referees show that they saw the results from behavioural science as a significant contribution to scholarship on cost estimation and capital investment planning.
Love and Ahiaga-Dagbui write about reference class forecasting (RCF): ‘To simply assume that a given project is comparable to past and completed projects and that a lump sum up-lift could be added to account for all uncertainties is a gross oversimplification of reality.’ I agree this is a simplification, like any forecast will be. But Love and Ahiaga-Dagbui overlook the fact that this simplification has been documented to produce better estimates on average than any other simplification.
Love and Ahiaga-Dagbui think that by coming up with a more detailed (less simple) account of costs they can better the results of RCF. That more detail would lead to more accuracy sounds intuitively right. But it is wrong. It is a classic example of our heuristics and biases tripping us up, as shown by behavioural science. It is similar to going to Las Vegas thinking you can beat the basic odds of the casinos by a more detailed understanding of how gambling works. You cannot. RCF gives you the basic odds of cost estimation and on average you will be better off sticking to these odds than anything else you can think up.
I understand this may be an unsettling conclusion for a conventional cost engineer, because it shows that much of what conventional cost engineering does can be bettered by simpler and theoretically more sound methods. But such is innovation in cost estimation today. Accept it and you may prosper; deny it and you will be left irrelevant.
Love and Ahiaga-Dagbui also write that ‘Reference Class Forecasting (RCF) ... utilizes a normal distribution’. This is factually wrong. RCF uses the empirical distribution in the reference class, whatever it is. For large infrastructure projects, typically the distribution is not normal but asymmetrical and fat-tailed.
In addition, Love and Ahiaga-Dagbui write that ‘an estimate for a large infrastructure project should include the estimated uncertainty measured by the relative standard deviation’. Again this is wrong. Distributions of cost overrun for large infrastructure projects are asymmetrical and fat-tailed, as said. For such distributions the standard deviation is not a good measure of uncertainty. The standard deviation ignores fat tails and gives the impression that distributions are symmetric, i.e. that overruns and underruns around the central value are equally likely, which is emphatically not the case for large infrastructure projects.
Kahneman and Tversky advocate instead presentation of the full distributional information as the preferred and most transparent option, which is what my colleagues and I do when we do RCFs. Nassim Nicholas Taleb advocates the use of the median absolute deviation as a more robust measure of variability than the standard deviation. In any case, no one familiar with the actual distribution of cost overrun in large infrastructure projects would recommend the standard deviation as a measure of uncertainty. The fact that cost engineers advocate and use this measure helps explain why their estimates are so consistently inaccurate. Garbage in, garbage out.
Love and Ahiaga-Dagbui further write that ‘it is more appropriate to use the median rather than the mean, which Flyvbjerg (2008) utilises when applying RCF’. Factually wrong again. First, the median and the average may both be used by forecasters, depending on what level of certainty they wish to achieve for their forecast. Second, the median, also called the P50, is mostly used for a portfolio of projects, because it indicates a fifty-fifty risk of going over/under budget, which means that projects that go over will be balanced by projects that go under. Third, where clients manage just one or a few projects, which is typical for megaprojects, clients are often more risk averse than portfolio managers. In this case neither median nor average is used in RCF, but higher P-values, often the P80, which indicates 80 percent certainty of staying within budget and 20 percent risk of going over. Fourth, I do not use the average in RCF, I use the value that corresponds to my client’s risk appetite, and the client decides. Typically this corresponds to the more conservative P80, but recently I did an RCF where a client representing a multi-billion-dollar megaproject was even more conservative and wanted 95 percent certainty of staying within budget. So for this project we used the P95 to estimate cost and contingencies.
As a final point regarding RCF, Love and Ahiaga-Dagbui cite the Edinburgh Tram as an example of inappropriate use of RCF. But they make a hash of their numbers by comparing monetary values that are not given in the same year’s prices and cover different cost items, undermining their conclusions. It is unsettling to see experienced cost engineers make such basic mistakes of comparing apples and oranges. For up-to-date and consistent numbers on the Edinburgh Tram, see my Report for the Edinburgh Tram Inquiry (February 2018, co-authored with Alexander Budzier), written as expert witness for the inquiry.
Love and Ahiaga-Dagbui rightly say evidence should decide truth claims. Today, a dozen independent evaluations exist with evidence that supports the accuracy of RCF over other estimation methods, for large and small projects alike. Here is the conclusion from one such evaluation, covering construction projects: ‘The conducted evaluation is entirely based on real-life project data and shows that RCF indeed performs best, for both cost and time forecasting, and therefore supports the practical relevance of the technique.’ (Batselier, J. and Vanhoucke, M., 2016, ‘Practical application and empirical evaluation of reference class forecasting for project management’, Project Management Journal, 47:5, p. 36).
I agree with Love and Ahiaga-Dagbui that larger samples than the 258 projects in our original study are desirable. We explicitly said about the sample in the 2002 paper (Journal of the American Planning Association, vol. 68, p. 294): ‘The question is whether the projects included in the sample are representative of the population of transportation infrastructure projects ... There are four reasons why this is probably not the case.’ For Love and Ahiaga-Dagbui to point out 16 years later that the sample ‘falls short of providing a statistically representative sample of the population’ is not news; or it is fake news in the jargon of Love and Ahiaga-Dagbui.
My colleagues and I have continued to enlarge and update the original dataset. The latest published results are based on a sample of 2,062 projects, the biggest and best of its kind, now including both costs and benefits (World Development, vol. 84, 2016, pp. 176–189). The new study shows the original 2002 results to be robust across more observations, more project types, more geographies, and a longer time span, now covering almost a hundred years.
The study also shows that if you want to get cost estimates right, you need to unlearn parts of conventional cost engineering, with its century-long record of getting estimates wrong. Instead you need to learn behavioural science, because behaviour is the main problem, not artifacts.”
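The percentile-based budgeting Flyvbjerg describes — reading a P50, P80 or P95 uplift directly off the empirical distribution of past overruns, rather than assuming normality — can be sketched in a few lines of Python. This is an illustrative toy, not either author's method or data: the overrun ratios below are invented, and a real reference class would contain actual outturn-to-estimate ratios from dozens of comparable completed projects.

```python
# Illustrative sketch of reference class forecasting (RCF) uplifts.
# The overrun ratios below are invented for demonstration only; note the
# distribution is asymmetric and fat-tailed, as Flyvbjerg describes --
# no normal distribution is assumed anywhere.

def percentile(sorted_data, p):
    """Linear-interpolation percentile of pre-sorted data (p in 0..100)."""
    k = (len(sorted_data) - 1) * p / 100
    lo, hi = int(k), min(int(k) + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (sorted_data[hi] - sorted_data[lo]) * (k - lo)

# Hypothetical reference class: outturn cost / estimated cost ratios
# from comparable completed projects (1.0 = delivered exactly on budget).
overrun_ratios = sorted([0.92, 0.98, 1.00, 1.03, 1.05, 1.08, 1.12,
                         1.15, 1.20, 1.28, 1.35, 1.50, 1.80, 2.40])

base_estimate = 100.0  # conventional bottom-up estimate, e.g. in £m

# P50 suits a portfolio (overs balance unders); risk-averse single-project
# clients pick higher P-values such as P80 or P95, as described above.
for p in (50, 80, 95):
    uplift = percentile(overrun_ratios, p)
    print(f"P{p}: uplift factor {uplift:.2f} -> budget {base_estimate * uplift:.1f}")
```

The design point is that the uplift is read from whatever shape the historical data actually has; with a fat right tail like the sample above, the P95 budget ends up far higher than the P50, which a symmetric mean-plus-standard-deviation estimate would understate.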
TransportXtra is part of Landor LINKS
© 2018 TransportXtra | Landor LINKS Ltd | All Rights Reserved