Improving reliability in forensic engineering: the Delft approach

Based on established theories from literature and best practices of forensic investigations in aerospace engineering, civil engineering and biomechanical engineering, the Delft University of Technology has developed a Delft approach for forensic investigations. This integrated approach consists of three elements. First, because a product has a life cycle with various phases, it is of importance to consider these phases when a failure is investigated. Second, it is acknowledged that failure is a multifaceted phenomenon. Therefore, the 'Tree House of Failures' was developed, a taxonomy or categorisation of failure causes, which addresses main groups of causes of failure related to product, instruction and execution. Third, use of a standard investigative approach with the steps 'orientation', 'data collection', 'hypotheses generation', 'hypotheses testing', 'recommendations' and 'findings reporting' is advised. In the Delft approach, the 'ring of trustworthiness' is used to underline the mind-set that a forensic engineering investigator should have to assure the investigation's reliability and validity. The ring of trustworthiness states that an investigation should be objective, repeatable, verifiable, complete and correct. This paper presents the Delft approach for forensic investigations and explains how to use it to prevent several common pitfalls and biases that occur in various stages of a forensic engineering investigation. This approach aims to increase the reliability of forensic engineering investigations worldwide.


Introduction
Forensic engineering can be defined as 'the professional practice of determining the cause or causes of failure of a constructed facility and of laying out the technical bases for identifying the parties responsible for that failure' (Ratay, 2009: p. 53). The primary goal for forensic engineers should be to set out all factual information that can be gathered after a failure in order to identify the cause of the failure. Unfortunately, there are usually limits to the resources available (time, manpower and money), putting constraints on how extensive an investigation can be made. The role of a forensic engineer is not only to identify but also to communicate the cause(s) of failure. Therefore, the importance of reporting cannot be overestimated. Failure of a technical system can be defined as a state or condition in which that technical system cannot observably or measurably fulfil some or any of its intended functions during its use.
There can be two reasons to investigate failures of technical systems.
(a) Investigations focused on safety aim to find ways to improve the technical system or prevent similar events. The following questions are of importance in safety investigations. 'What happened?' 'What caused it to happen?' 'How could it be avoided next time?' 'How could the technical system be improved to avoid future failures?'
(b) Investigations with a legal focus aim to find the party or person responsible for the failure. Apart from the question 'what happened and why?', the central question in these kinds of investigations is 'who is responsible for the cause of the failure and its consequences?' In this case, the investigation is in search of responsibility and liability.
Forensic engineering investigations have been performed for ages. Hammurabi's code (around 1750 BC) stated firmly that 'if a builder builds a house for someone, and does not construct it properly, and the house which he built falls in and kills its owner, then that builder shall be put to death'. So it can be imagined that even in those times, someone had to be appointed to check if a house was constructed properly or not and to conclude if the builder was liable or not.
Yet in many countries, guidelines for conducting forensic investigations do not exist. As a consequence, many forensic engineering companies have their own specific approaches. As these approaches are internal company procedures, it is not always clear whether these investigations lead to trustworthy outcomes. In the authors' opinions, the quality of these investigations varies. In many cases, reports are not subjected to clear requirements regarding investigation rigour and depth. Furthermore, in various legal cases forensic engineers show opposing views. For assessors, such as insurance companies or judges, it is hard to establish which of two (or more) opposing expert opinions is right. Therefore, there is a need for principles that increase the trustworthiness of investigations.
Furthermore, forensic investigations occur in many domains, and there are many types of failure. For instance, airplanes crash, medical instruments get contaminated and buildings collapse. Aviation has been investigating failures for many years to improve safety. The International Civil Aviation Organization was founded in 1944 and established rules of airspace, aircraft registration and safety (Chicago Convention, 1944). In annexes to the convention, guidelines and protocols for investigating accidents were formulated (Annex 13). According to these annexes, the final investigation report should promote aviation safety and not apportion blame or liability. The safety culture within structural engineering seems less developed compared with the aviation or process industry (Terwel and Zwaard, 2012). Therefore, it seemed worthwhile to include and combine insights from various industries when developing a forensic approach.
The Safety Methods Database combines insights from various industries (Everdij and Blom, 2016). In this database, maintained by the Dutch National Aerospace Laboratory, an overview is given of more than 840 investigation methods applied in various domains. Among the identified methods are barrier analysis, multi-event sequencing, the Swiss cheese model and timeline analysis. The focus of these methods is usually on a single step of an investigation or on a specific domain. A general approach to an investigation from the fact-finding stage up to defining recommendations to learn from failures is not addressed.
Furthermore, several general approaches are available (such as the book by Noon (2001) and the study by Esreda (2009)), covering various stages of an investigation. However, a clear integral approach, addressing all life-cycle phases of a product, clear failure taxonomy, steps of an investigation and quality assurance that can be used in various domains, is currently lacking. Therefore, three researchers from different faculties at the Delft University of Technology developed an integral forensic engineering approach, the Delft approach for forensic investigations. This approach is suitable to be applied across various domains. In this paper, first, several threats to reliable forensic investigations are highlighted. Subsequently, the paper presents the Delft approach for forensic investigations, which acknowledges the technical system's life cycle, includes various failure characteristics, introduces a stepwise approach for conducting an investigation, and provides a strategy to increase the trustworthiness of an investigation.

Threats to trustworthy investigations
Various threats to trustworthy investigations can exist. In the following, three types of threats are briefly discussed: general biases of the human mind, human errors and other error types. Kahneman (2011) highlights several limitations of the human mind that can be of relevance for forensic investigations. Kahneman explains that humans have two systems for decision making. The first decision system works fast and intuitively and is based on assumptions and earlier experiences. The second decision system works slower, but is able to assess various options thoroughly. However, the slow system is energy-consuming and tiring. Therefore, one of the major pitfalls is that one prefers to use the fast system, which entails the tendency to jump to conclusions, without performing proper analysis. In general, this often leads to satisfying results, but in a forensic investigation, one might miss essential elements when omitting thorough analysis.

General biases of the human mind
Three general biases of the human mind that can hamper coming to reliable conclusions are confirmation bias, availability bias and contextual bias.
The human mind, and in particular the fast decision system, is prone to confirmation bias. People tend to seek data that are 'likely to be compatible with the beliefs they currently hold' (Kahneman, 2011: p. 81). One has the tendency to 'steer clear of information that may disagree with those prior beliefs' (Budowle et al., 2009: p. 803). A special type of confirmation bias is outcome bias: one tries to match all available information with a predetermined conclusion. Another type of confirmation bias is group thinking (Esreda, 2009), where one tends to adjust a personal opinion to the group opinion. A final type of confirmation bias is the 'halo effect'. Many people tend to like or dislike everything about a person once they have had a positive (or negative) first impression of this person. This can be important when an investigator interviews, for instance, a very attractive person who is not necessarily more reliable than a less attractive one.
Kahneman asks that attention be paid to a second general bias: availability bias, which is about basing conclusions only on the information that is available, without looking beyond the data available at that moment. He calls this 'WYSIATI': what you see is all there is, which means that one tends to believe that all that is seen is all that exists. This principle can result in overconfidence, leading one to be satisfied when a nice story can be made up from what is seen, even when little is seen. Budowle et al. (2009) additionally pointed out contextual biases, which are about using existing information to reinforce a point of view, even when the used information is not necessarily related to the particular case under review.
Several authors stress that these three kinds of biases are very common and need to be recognised and acknowledged (Budowle et al., 2009; Byrd, 2006; Christensen et al., 2014). However, while biases may influence the decision-making process, they do not necessarily have to influence the outcome of the investigation (Dror et al., 2012).

Human errors
Apart from general biases of the human mind, humans are prone to making errors in the execution of tasks. This also applies to forensic investigation tasks. Swain and Guttman (1983) distinguish errors of omission (failure to perform a task) and errors of commission (incorrect performance of a task). Reason (1990: p. 9) makes a similar distinction between two types of errors, namely, slips/lapses and mistakes.
■ Slips/lapses are 'errors which result from some failure in the execution and/or storage stage of an action sequence, regardless of whether or not the plan which guided them was adequate to achieve its objective'. Slips and lapses often are the result of fatigue, forgetfulness or habits (Kletz, 2001) and apply to skill-based tasks.
■ Mistakes are 'deficiencies or failures in the judgmental and/or inferential processes involved in the selection of an objective or in the specification of the means to achieve it, irrespective of whether or not the actions directed by the decision-scheme run according to plan…'. Mistakes can be regarded as an ignorance of the correct task or of the correct way to perform it (Kletz, 2001). Mistakes can be rule-based or knowledge-based and are similar to the errors of omission from Swain and Guttman (1983).
Kletz (2001) also adds that errors can be conscious non-compliance or violations. Furthermore, Bea (1994) highlights that errors can be made by operational personnel, but in many cases, management personnel are responsible. Terwel (2014) further addresses that errors can be found not only on an individual level, but also on organisational levels and in the interaction between various parties.

Other errors
Apart from human or 'practitioner' errors, Christensen et al. (2014: p. 124) point out instrument error, statistical error and method error in the approaches used specifically to test hypotheses. Instrument error is defined as 'the difference between an indicated instrument value and the actual (true) value'. Statistical error is defined as 'the deviation between actual and predicted values' (Christensen et al., 2014: p. 124), and method error is related to limitations in a method itself.
These general biases, human errors and other errors are threats to reliable investigations. Therefore, the Delft approach for forensic investigations was developed with the aim of increasing the reliability of forensic investigations.

Delft approach for forensic investigations
Now that possible threats to reliable forensic investigations are known, the various elements of the Delft approach for forensic investigations are explained: life-cycle phases, 'Tree House of Failures', steps of a forensic investigation and the ring of trustworthiness. Finally, measures to increase trustworthiness are explored.

Life-cycle phases
A technical system goes through various life-cycle phases (see Figure 1):
■ develop, where the technical system is specified and designed
■ produce, where the building/production/manufacturing/assembly takes place
■ utilise, the actual use phase including maintenance and repair
■ recycle, where parts of the technical system are recycled or disposed of.
In each phase actions are performed that need to be verified. Rules and regulations usually apply to every phase of the life cycle. In various phases, requirements or specifications are stated that affect other life-cycle phases.
Various industries use different names for similar life-cycle phases; for example, within the building industry, the phases of a project are often called initiative, design, construction-ready, construction and use (Terwel, 2014). For different kinds of projects and different types of industries, the lengths and contents of phases can vary, but the basic concept as presented in Figure 1 applies in the majority of cases. The cause of a failure can be rooted in any of these phases.

Tree House of Failures
In order to aid forensic engineers in systematically taking all plausible potential causes of failure of technical systems into consideration during an investigation, a hierarchical checklist was developed: the Tree House of Failures (see Figure 2). The figure shows a diagram with three main groups (failure carriers) of potential causes, related to product, instruction or execution of these instructions. An investigation may start by identifying product flaws. These flaws can be related to the causal stems (second level in Figure 2) of integrity or ergonomics. Integrity covers issues stemming from faults in the physical integrity of the technical system, for example, in the construction, electrical system, heat transfer, chemical aspects or other physical features. Ergonomics covers problems stemming from the design of the technical system that hamper easy and error-free use of the system.
Flaws in the integrity of a technical system can be rooted in
■ configuration (related to completeness and set-up of the total system)
■ geometry (related to shape and size of parts)
■ material (related to properties of materials and possible deterioration)
■ intactness (related to any signs of detachments, tears, fractures, wear and erosion)
■ purity (related to anything that is in the system that should not be there)
■ dependencies (related to issues arising from connections to or relations with the system's surroundings, such as excessive loads).
Instructions can be issued by a government (e.g. laws and regulations), a field (e.g. standards and codes), the maker of the technical system (e.g. design requirements, technical drawings, user manuals, safety and maintenance instructions) or the organisation where the technical system is used (e.g. internal work instructions within a company). When instructions are flawed, one should check if this has to do with applicability, validity or availability. Applicability covers whether the correct instructions were used. Invalid instructions are instructions that were not suitable for the purpose. Availability is about whether the required instructions were made available to the group or individual that should have executed these instructions.
Finally, the failure carrier execution shows that causes of failure can be related to the execution of instructions and the acts of parties developing, producing or using a product. Execution-related causes can stem from a lack of knowledge, incorrect application of rules or a lack of skills, resulting in errors in performing an action or routine (based on the book by Reason (1990), who distinguished knowledge-based mistakes, skill-based slips and lapses and rule-based mistakes).
Knowledge-, rule- and skill-based errors can be rooted in a flawed attempt, incorrect choices, lack of ability or insufficient awareness.
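The hierarchy described above can be written down as a small data structure, which may help when the Tree House of Failures is used as a checklist. The following is a minimal sketch in Python and not part of the Delft approach itself: the dictionary layout, the names TREE_HOUSE_OF_FAILURES and checklist, and the choice to attach all four execution roots to each of the three execution stems are assumptions made for illustration, because the text does not state which roots belong to which stem. Roots that the text does not list (for ergonomics and for the instruction stems) are left empty.

```python
# Illustrative encoding of the Tree House of Failures as described in the text.
# Layout and names are assumptions of this sketch, not taken from Figure 2.

# Roots named in the text for knowledge-, rule- and skill-based errors.
# Attaching all four to every execution stem is an assumption.
EXECUTION_ROOTS = [
    "flawed attempt",
    "incorrect choices",
    "lack of ability",
    "insufficient awareness",
]

TREE_HOUSE_OF_FAILURES = {
    "product": {
        "integrity": [
            "configuration",
            "geometry",
            "material",
            "intactness",
            "purity",
            "dependencies",
        ],
        "ergonomics": [],  # no roots are listed in the text
    },
    "instruction": {
        "applicability": [],
        "validity": [],
        "availability": [],
    },
    "execution": {
        "knowledge-based": list(EXECUTION_ROOTS),
        "rule-based": list(EXECUTION_ROOTS),
        "skill-based": list(EXECUTION_ROOTS),
    },
}


def checklist(tree=TREE_HOUSE_OF_FAILURES):
    """Yield (carrier, stem, root) triples; root is None if no roots are listed."""
    for carrier, stems in tree.items():
        for stem, roots in stems.items():
            if not roots:
                yield (carrier, stem, None)
            for root in roots:
                yield (carrier, stem, root)


if __name__ == "__main__":
    for carrier, stem, root in checklist():
        print(carrier, "/", stem, "/", root if root else "(no roots listed)")
```

Walking through every triple in turn is one way to make sure that no failure carrier or causal stem is skipped while hypotheses are being generated.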

The six steps of a forensic engineering investigation
In the literature, various possible steps for a structured forensic engineering investigation can be found. Several authors (Borsje et al., 2014; Brady, 2012; Budowle et al., 2009; Noon, 2001) promote a scientific approach of collecting information, developing hypotheses for the cause of failure, testing each hypothesis against the data and determining the most probable cause of failure. A similar approach is also advocated by Esreda (2009).
In an attempt to cover all relevant steps suggested in the sources mentioned earlier, the Delft approach for forensic investigations defines these steps as
■ orientation
■ data collection
■ hypotheses generation
■ hypotheses testing
■ findings reporting
■ recommendations.
During orientation, one determines the stakeholders, objectives and scope of the investigation, the expertise needed to reach the objectives and a data collection strategy. Furthermore, one has to decide if one is qualified to perform this investigation and if there are any conflicts of interest.
During data collection, information about the failure(s) is collected in a field investigation or by desk research. In the field investigation, the technical system is observed as is (in its failed state, which does not necessarily entail physical damage). Furthermore, interviews may be performed with witnesses and other relevant persons. Samples may be collected for further testing, if required, and records may be taken for examination. Similar events, data logs, reports, design and construction drawings or production instructions are examined during desk research.
During hypotheses generation, the investigator compiles a list of possible explanations for what may have caused the failure(s). What might have been the technical or procedural causes? What was the chain of events leading to the failure? What ultimately triggered the failure onset (the root cause)? It should be noted that any observed failure could be in itself the result of another failure or a cause of a subsequent failure. To aid in systematically taking all possible causes into consideration and avoid overlooking any plausible causes when generating hypotheses, a failure exploration routine consisting of four steps is proposed.
During hypotheses testing, the hypotheses are tested against the observed failure using the collected data. It is checked whether the collected data provide a logical explanation for the observed failure. This can be done by reasoning or by conducting validation calculations, simulations or real-life tests. Sometimes, it is necessary to look for additional hypotheses or data if the generated list of hypotheses does not provide a satisfying answer to the main investigation questions or if the available data are insufficient to answer all questions or test certain hypotheses.
During findings reporting, all steps of the investigation should be addressed in a well-written report. The report ultimately should achieve the objectives stated in the orientation phase.
In a safety investigation, recommendations conclude the investigation with the aim to prevent similar failures in the future (Figure 3).
The six steps of a forensic engineering investigation, the Tree House of Failures and life-cycle phases should provide a clear framework to perform a structured and reliable forensic investigation systematically.
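The order of the six steps, including the possibility of returning from hypotheses testing to look for additional data or hypotheses, can be sketched as a simple transition check. The sketch below is in Python and is only an illustration under stated assumptions: the names Step, allowed_next and is_valid_path are hypothetical, and the rule that the only backward moves are from hypotheses testing to data collection or hypotheses generation is this sketch's reading of the text, not a rule stated in the Delft approach.

```python
from enum import IntEnum


class Step(IntEnum):
    """The six steps, numbered in the order in which the text lists them."""
    ORIENTATION = 1
    DATA_COLLECTION = 2
    HYPOTHESES_GENERATION = 3
    HYPOTHESES_TESTING = 4
    FINDINGS_REPORTING = 5
    RECOMMENDATIONS = 6


def allowed_next(step):
    """Return the set of steps that may follow `step` in this sketch."""
    following = set()
    if step < Step.RECOMMENDATIONS:
        following.add(Step(step + 1))  # the normal forward move
    if step is Step.HYPOTHESES_TESTING:
        # Assumed loop-back: testing may show that more data or more
        # hypotheses are needed before the findings can be reported.
        following |= {Step.DATA_COLLECTION, Step.HYPOTHESES_GENERATION}
    return following


def is_valid_path(path):
    """Check that a sequence of steps starts at orientation and only uses allowed moves."""
    if not path or path[0] is not Step.ORIENTATION:
        return False
    return all(b in allowed_next(a) for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    path = [
        Step.ORIENTATION,
        Step.DATA_COLLECTION,
        Step.HYPOTHESES_GENERATION,
        Step.HYPOTHESES_TESTING,
        Step.DATA_COLLECTION,  # additional data turned out to be needed
        Step.HYPOTHESES_GENERATION,
        Step.HYPOTHESES_TESTING,
        Step.FINDINGS_REPORTING,
        Step.RECOMMENDATIONS,
    ]
    print(is_valid_path(path))
```

A check of this kind could be used, for instance, to verify that an investigation log does not skip hypotheses testing on the way to the report.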

Ring of trustworthiness
A forensic investigation can be regarded as a case study. A case study is 'an empirical inquiry that investigates a contemporary phenomenon in depth and within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident' (Yin, 2009: pp. 40-41). For a case study, several requirements regarding reliability and validity apply (Yin, 2009). In general, reliability and validity of a method have to do with the 'applicability of the method to the question asked, and the suitability of the method for the intended purpose' (Kardon et al., 2006: p. 1). Reliability means that if under the same conditions, the study were replicated, the same results would be achieved. Reliability refers to the stability of findings (Whittemore et al., 2001). Validity means that the research has the quality of being logically or factually sound (Oxford Dictionary). It represents the truthfulness of the findings (Whittemore et al., 2001). For validity, a distinction can be made between external, internal and construct validities (see the study by Kardon et al. (2006)).
Several scientists argue that reliability and validity are valuable concepts for quantitative research, but are not always applicable for qualitative research (Whittemore et al., 2001). Lincoln and Guba suggest that trustworthiness in qualitative research is the equivalent of rigour or reliability and validity in quantitative research (Morse et al., 2002). Trustworthy can be defined as 'able to be relied on as honest or truthful' (Oxford Dictionary). Accuracy of the results should therefore be key.
Forensic investigations can be regarded as case studies with mixed use of qualitative and quantitative aspects. Therefore, new criteria for trustworthy investigations were developed for the Delft approach, particularly because the concepts of reliability and validity may be too abstract for adequate use in forensic investigations.
As a starting point, the criteria that Lincoln and Guba proposed for trustworthy research were used because these are considered to be the 'gold standard' in qualitative research (Whittemore et al., 2001).
■ Credibility shows congruency of findings with reality (preferred over the abstract term internal validity (Shenton, 2004)).
■ Transferability shows that the results are applicable to a different setting (as a substitute for external validity/generalisability (Shenton, 2004)).
■ Dependability shows that if the research was repeated, in the same context, with the same methods and same participants, the same results would be obtained (as a substitute for reliability (Shenton, 2004)).
■ Confirmability shows that the work's findings are the result of the experiences and ideas of the informants and based on the given data, rather than the characteristics and preferences of the researcher (Shenton, 2004).
These criteria are also useful for forensic investigations, but it was decided not to include transferability because in many forensic investigations this is not a goal in itself. Furthermore, credibility was split into completeness and correctness, dependability was split into repeatability and verifiability (to be more specific) and confirmability was renamed as objectivity (a term with wider recognisability).
In the Delft approach, these trustworthiness criteria are part of the ring of trustworthiness (see Figure 4), consisting of the following elements (definitions loosely based on the Oxford Dictionary)
■ objective: not influenced by feelings or opinions or other biases in considering and representing facts
■ repeatable: results of experiments and analyses can be fully reproduced based on the descriptions in the report
■ verifiable: all presented information, and the way it was obtained, is provided in a transparent way, so it can be checked or demonstrated to be true and accurate or justified
■ complete: containing everything that is necessary or appropriate; for example, not missing any information that is necessary to understand the context, information, approach and decisions
■ correct: free from error; in accordance with fact or truth.
It should be noted that these criteria should not only be used to evaluate the trustworthiness of a report retrospectively (Morse et al., 2002) but should also be incorporated during every step of an investigation, so that the investigation will result in trustworthy outcomes.

Measures to improve trustworthiness
In the literature, several suggestions are given to increase the trustworthiness of qualitative research. Creswell and Miller (2000) list various procedures for establishing trustworthiness: checking, triangulation (using various sources/approaches), 'thick' description (providing more than minimal information in descriptions), peer reviews, external audits, including disconfirming evidence, prolonged engagement in the field (doing more than just a quick desk study to understand a phenomenon), research reflexivity (reflection on approaches and assumptions by the investigator) and collaboration. An example of very thorough checking or reviewing is cross-examination, as used in the UK legal system, where opposing parties meticulously question the assumptions, methods and findings of the other party. This is a rigorous way of checking the validity of the statements of a forensic engineer. Shenton (2004) provides a list of measures he used in his research to increase aspects of trustworthiness. To increase credibility, Shenton used well-established research methods, triangulation, negative case analysis, peer scrutiny of research project, reflective commentary, background, qualifications and experience of the investigator, member checks of data collected and interpretations, thick description of phenomenon under study, and examination of previous research to frame findings. To increase dependability, Shenton advocates employment of overlapping methods and in-depth methodological descriptions to allow studies to be repeated. To improve confirmability, investigator bias was reduced by using triangulation and a critical reflection on researchers' beliefs and assumptions with recognition of shortcomings in study methods and their potential effects.
Budowle et al. (2009) give several suggestions for quality assurance in forensic science: adherence to using validated and documented protocols, using tested reagents, using calibrated equipment, using appropriate control samples and applying recognised, detailed and methodical documentation requirements and independent review of operations, results and interpretations. Furthermore, ISO 17025 (ISO, 2017) provides a list of requirements that are relevant for laboratories for reliable forensic testing.
From the listed recommendations to increase the trustworthiness of an investigation, a number of recommendations were extracted related to the elements of the ring of trustworthiness.
■ For being objective: stay factual, avoid mixing facts and opinions. Include internal or external review (including cross-examination) of the analyses and findings, and use various sources (triangulation).
■ For being repeatable: use a systematic approach, include a detailed description of the methods used and list every step of the methods used.
■ For being verifiable: write a structured report and provide evidence and reasoning for all findings. Safely store relevant evidence (debris, recordings and records) during the course of the investigation. Use 'thick' descriptions (elaborate description with attention for relevant details).
■ For being complete: use a systematic approach. Take some distance and consider if any data or relevant hypotheses were missed. Include counterfactual evidence and address rival explanations.
■ For being correct: follow rules of logic, use established/validated test methods and use the four-eye principle (internal or external checking), because to err is human.
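The recommendations listed above lend themselves to use as a review checklist. The following Python sketch shows one possible way to record which measures have been applied and to list what remains open per element of the ring of trustworthiness. It is an illustration only: the names RING_OF_TRUSTWORTHINESS and open_items are hypothetical, and the short measure labels are paraphrases of the recommendations above rather than wording prescribed by the Delft approach.

```python
# Measures per element of the ring of trustworthiness, paraphrased from the
# recommendations in the text. Labels and structure are assumptions of this sketch.
RING_OF_TRUSTWORTHINESS = {
    "objective": [
        "facts kept separate from opinions",
        "internal or external review",
        "various sources used (triangulation)",
    ],
    "repeatable": [
        "systematic approach",
        "detailed description of the methods used",
        "every step of the methods listed",
    ],
    "verifiable": [
        "structured report",
        "evidence and reasoning for all findings",
        "relevant evidence safely stored",
        "thick descriptions",
    ],
    "complete": [
        "systematic approach",
        "check for missed data or hypotheses",
        "counterfactual evidence and rival explanations addressed",
    ],
    "correct": [
        "rules of logic followed",
        "established or validated test methods",
        "four-eye principle",
    ],
}


def open_items(applied):
    """Return, per criterion, the measures that are not yet in `applied`."""
    applied = set(applied)
    remaining = {}
    for criterion, measures in RING_OF_TRUSTWORTHINESS.items():
        missing = [m for m in measures if m not in applied]
        if missing:
            remaining[criterion] = missing
    return remaining


if __name__ == "__main__":
    done = ["systematic approach", "structured report", "four-eye principle"]
    for criterion, missing in open_items(done).items():
        print(criterion + ":", "; ".join(missing))
```

Because 'systematic approach' appears under both repeatable and complete, applying it once closes the item for both criteria, which mirrors the overlap in the recommendations.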

Conclusions and recommendations
The proposed Delft approach for forensic engineering investigations is based on established theories from the literature combined with practical experience with forensic investigations in the domains of biomechanical engineering, aerospace engineering and civil engineering and is believed to be applicable in other domains as well. The approach addresses the importance of the life cycle of technical systems. Combined with the Tree House of Failures, it provides a valuable framework to explore various possible causes systematically. The six steps proposed for approaching a forensic engineering investigation can structure the investigation process, and the ring of trustworthiness provides suggestions to increase the reliability of investigations and the trustworthiness of the outcomes.