# Smart Infrastructure and Construction

E-ISSN 2397-8759
 Volume 173 Issue 4, December 2020, pp. 74-95 Themed issue on the application of machine learning

### Full Text

The proliferation of data collected by modern tunnel-boring machines (TBMs) presents a substantial opportunity for the application of machine learning (ML) to support the decision-making process on-site with timely and meaningful information. The observational method is now well established in geotechnical engineering and has a proven potential to save time and money relative to conventional design. ML advances the traditional observational method by employing data analysis and pattern recognition techniques, predicated on the assumption of the presence of enough data to describe the physics of the modelled system. This paper presents a comprehensive review of recent advances and applications of ML to inform tunnelling construction operations with a view to increasing their potential for uptake by industry practitioners. This review has identified four main applications of ML to inform tunnelling – namely, TBM performance prediction, tunnelling-induced settlement prediction, geological forecasting and cutterhead design optimisation. The paper concludes by summarising research trends and suggesting directions for future research for ML in the tunnelling space.

$A$: face area of the tunnel-boring machine
$a$: parameter controlling the fuzziness of the system using the fuzzy c-means clustering algorithm
$b$: adjustable bias vector
$C$: pipe string convergence
$c$: mean of the Gaussian function
$c'$: soil cohesion
$D$: diameter of the tunnel-boring machine
$D_c$: diameter of the tunnel-boring machine cutting disc
$d_j$: centre of data cluster $j$
$E_l$: Young’s modulus of the tunnel lining
$E_s$: Young’s modulus of the soil
$EI$: flexural rigidity
$f(x)$: latent function representing the underlying structure of the data
$g^*$: global best historical location for the particle swarm optimisation algorithm
$H$: tunnel cover depth
$H_w$: height of the groundwater table above the tunnel-boring machine
$i_t$, $i_l$: transverse and longitudinal inflection points of the soil surface settlement curve, respectively
$J_{\mathrm{FCM}}$: fuzzy c-means clustering objective function
$K$: coefficient of lateral earth pressure
$k_s$: soil permeability
$k_{\mathrm{sub}}$: modulus of subgrade reaction
$k(x, x')$: covariance function of input pairs $x$ and $x'$
$L$: tunnel length
$L_\varepsilon(y)$: ε-insensitive cost function used in the support vector machine algorithm
$M_{ij}$: fuzzy c-means clustering membership matrix
$n_f$, $n_{hi}$, $n_o$: number of features, neurons in hidden layer $i$ and outputs (respectively) in an artificial neural network
$n_p$, $n_c$: number of data points and clusters, respectively, using fuzzy c-means clustering
$P_c$: chamber pressure
$P_f$: tunnel-boring machine face pressure
$P_g$: grout pressure
$Q_f$: conditioning foam flow rate
$q$: rock quartz content
$R$: tunnel radius
$R_d$: soil relative density
$S_{\max}$: maximum soil surface settlement
$S(X, Y)$: soil surface settlements at the settlement monitoring point position at $X$, $Y$
$s_c$: spacing of tunnel-boring machine cutting discs
$s_u$: undrained shear strength of the soil
$T$: cutterhead torque
$t$: time
$t_c$: thickness of the tunnel-boring machine cutting disc
$t_l$: tunnel lining thickness
$U_{\max}$: maximum horizontal soil displacement
$V_g$: volume of grout
$V_L$: volume loss
$v_i$: velocity of particle $i$ used in the particle swarm optimisation algorithm
$w$: adjustable weight vector
$X$: horizontal (transverse) distance to the settlement monitoring point
$x_i^*$: local (for particle $i$) best historical location for the particle swarm optimisation algorithm
$Y$: horizontal (longitudinal) distance to the settlement monitoring point ahead of the face of the tunnel-boring machine
$\alpha$: orientation of the planes of weakness in the rock mass
$\beta_g$, $\beta_l$: global and local learning parameters, respectively, for the particle swarm optimisation algorithm
$\gamma$: soil unit weight
$\gamma_{\mathrm{SVM}}$: support vector machine kernel coefficient
$\epsilon$: Gaussian noise
$\zeta(x)$: Gaussian membership function of an input value $x$
$\theta$: tunnel-boring machine pitching angle
$\kappa$: slope of the soil unload–reload curve
$\mu(x)$: mean vector of a Gaussian process
$\nu_l$: Poisson’s ratio of the tunnel lining
$\nu_s$: Poisson’s ratio of the soil
$\rho_1$, $\rho_2$: two randomly initiated vectors with entries ranging between 0 and 1
$\sigma$: standard deviation of the Gaussian function
$\phi'$: soil friction angle
$\psi'$: soil dilation angle

Rapid urbanisation points to the use of underground space as one of the most viable, sustainable and efficient means of delivering new services and transport in congested urban areas. The use of trenchless technology in infrastructure construction is growing in popularity for its cost and environmental savings compared with conventional open-excavation techniques (Royston et al., 2020b). In these obstructed underground spaces, optimising the performance of tunnelling operations is critical to ensure safe and economical construction while also preventing damage to existing infrastructure both above and below ground (Chieh et al., 2020).

Traditionally, tunnelling contractors have relied on empiricism, in addition to more formal design calculations. While simplified design calculations play an important role in tunnel design and construction, optimising tunnelling operations remains technically challenging due to their dependence on several complex factors, such as site geology, tunnel-boring machine (TBM) operational parameters and tunnel geometry (O’Dwyer et al., 2018, 2020; Phillips et al., 2019). Although a significant body of research conducted over the past 30 years has greatly enhanced understanding of these effects and their influence on tunnelling operations, the literature contains many examples where static ‘rule-based’ design methods fail to provide satisfactory prediction of field behaviour – for example, the papers by Barla et al. (2006), Choo and Ong (2015) and Sheil et al. (2016).

The proliferation of data collected by modern TBMs presents a substantial opportunity for the application of machine learning (ML) to support the decision-making process on-site with timely and meaningful information (Sheil et al., 2020). While Shreyas and Dey (2019) present a high-level overview of ML techniques for tunnelling settlement and performance prediction, a more comprehensive review of recent advances and applications of ML to inform tunnelling construction operations is warranted to increase their potential for uptake by industry practitioners. To this end, this review has identified four main applications of ML to inform tunnelling – namely, TBM performance prediction, tunnelling-induced settlement prediction, geological forecasting and cutterhead design optimisation. The paper concludes by summarising research trends and suggesting directions for future research for ML in the tunnelling space.

2.1 Overview

The practice of ML has experienced immense recent growth, driven by advances in computational performance, sensing technology and data storage. In geotechnical engineering, ML advances the traditional observational method by employing data analysis and pattern recognition techniques, predicated on the assumption of the presence of enough data to describe the physics of the modelled system. These observational techniques have a proven potential to save time and money relative to conventional design (e.g. Royston et al., 2020a; Sheil et al., 2018).

‘Artificial intelligence’ (AI), ‘ML’ and ‘deep learning’ are three terms often used interchangeably to describe software that behaves in an intelligent manner. ML is a subset of AI that provides systems with the ability to learn and perform certain tasks automatically without being explicitly programmed. The most common implementation of ML involves the development of relationships between inputs and outputs. Where the outputs are provided with known labels (i.e. the correct outputs are known), the learning process is referred to as ‘supervised’. This contrasts with ‘unsupervised’ learning, where instances do not have corresponding labels. Deep learning is a further subset of ML that uses a specific class of ML algorithm, ‘deep’ artificial neural networks (ANNs) with many hidden layers, to learn from large amounts of data.

A drawback of many supervised learning techniques is the requirement for a large database of high-quality information to accurately capture the physics of the modelled system. The size of the data set required for the training process is highly dependent on the type of ML technique adopted, its intended role (e.g. interpolation, optimisation, forecasting) and the complexity of the input–output relationship being modelled. This section provides a brief overview of ML techniques commonly applied to tunnelling operations.

2.2 Artificial neural networks

An ANN is an information-processing paradigm that draws inspiration from the operation of the human brain. A network consists of multiple interconnected layers of neurons, comprising a layer of input neurons, one or more layers of ‘hidden’ neurons that perform operations on the data and a layer of output neurons. Transformation of the input data is performed by the artificial neurons through the application of a non-linear function (known as the activation function) to the sum of weighted inputs (see Figure 1). In its simplest form (a feedforward neural network), data travel in one direction – from input to output. After each complete iteration, termed an ‘epoch’, the network output values are compared with the target values to produce an error measurement. Feedback of the error through the network, known as ‘back-propagation’, is a non-linear optimisation process that adjusts the weight and bias of each connection towards reducing the value of the cost function. In this paper, the ‘architecture’ describes the network structure in the form $n_f$-…-$n_{hi}$-…-$n_o$, where $n_f$, $n_{hi}$ and $n_o$ are the number of features, neurons in hidden layer $i$ and outputs, respectively.

Figure 1 Structure of a single artificial neuron showing mathematical operations performed on the input data
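
The operation performed by a single artificial neuron can be illustrated with a minimal sketch. The logistic (sigmoid) activation and the input, weight and bias values below are assumptions chosen purely for illustration; they are not taken from any of the reviewed studies.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation, one common choice of non-linear function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Apply the activation function to the weighted sum of inputs plus bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


# Hypothetical example: three scaled input features feeding one neuron.
y = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, -0.2], bias=0.1)
print(round(y, 4))
```

A full network stacks many such neurons in layers, and training adjusts `weights` and `bias` for every neuron by back-propagation.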

An alternative network form is a recurrent neural network (RNN) wherein connections between units form a directed cycle. This allows the network to maintain information in ‘memory’ over time and therefore use historical calculations to determine outputs. Long short-term memory (LSTM) is a type of RNN that uses a ‘memory cell’ that can store information for long periods of time. A set of ‘gates’ is used to decide whether information is stored in the memory cell, when information from the memory cell is deployed in the network or when information is removed from the cell altogether (i.e. forgotten).

2.3 Fuzzy logic

Fuzzy logic (FL) involves the integration of expert knowledge and experience into a fuzzy inference system using fuzzy ‘If–Then’ rules to model the qualitative aspects of human knowledge. This allows an extension of binary, classic logic to qualitative, subjective and approximate situations. Takagi and Sugeno (1985) presented the first systematic investigation of fuzzy modelling. The purpose of a fuzzy inference system is to map inputs to outputs through the application of fuzzy reasoning. Fuzziness is first applied to the inputs to produce a fuzzy set using a ‘membership function’, ζ(x), such as the Gaussian membership function:

$\zeta(x) = \exp\left[-\frac{(x - c)^2}{2\sigma^2}\right]$ (1)

where x is the input value and σ and c are the standard deviation and mean of the Gaussian function, respectively. The resulting fuzzy set is processed using a set of If–Then rules. The results are subsequently defuzzified to produce ‘crisp’ outputs.
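
As a brief illustration of the fuzzification step, the sketch below evaluates the Gaussian membership function of Equation 1 for a single input against three fuzzy sets. The set names, centres and standard deviations are hypothetical values chosen only to show the calculation.

```python
import math


def gaussian_membership(x: float, c: float, sigma: float) -> float:
    """Degree of membership of x in a fuzzy set with mean c and std dev sigma."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


# Hypothetical fuzzy sets for a rock strength input (values in MPa are assumed).
fuzzy_sets = {"low": (50.0, 25.0), "medium": (125.0, 25.0), "high": (200.0, 25.0)}

ucs = 110.0
memberships = {name: gaussian_membership(ucs, c, sigma)
               for name, (c, sigma) in fuzzy_sets.items()}
for name, degree in memberships.items():
    print(f"{name}: {degree:.3f}")
```

Here the input belongs mostly to the ‘medium’ set while retaining a small, non-zero membership of the neighbouring sets, which is the behaviour that distinguishes fuzzy from binary logic.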

Adaptive neuro-fuzzy inference systems (ANFISs) denote the fusion of neural networks with FL principles. The key difference from traditional neural networks is that some or all of the nodes in the network are modified to be ‘adaptive’. This means that the outputs of the network depend on the nodal parameters, and the learning rule updates these parameters to minimise a prescribed error measurement. Relationships between variables are defined using fuzzy If–Then rules. ANFIS networks are typically organised in five layers as follows: (a) layer 1 is the input layer comprising the adaptive nodes and node functions and activates the fuzziness of the inputs, (b) layer 2 determines the firing strength of each rule, (c) layer 3 normalises the firing strengths, (d) layer 4 defines the consequence parameters and (e) layer 5 computes the ANFIS outputs by summing the outputs of layer 4.
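
The five layers can be sketched as a single forward pass. The sketch assumes Gaussian membership functions, a product operator for the firing strength and first-order (linear) Takagi–Sugeno consequents; these are common choices but are assumptions here, as are the dictionary keys `premise` and `consequent` and all numerical values. Training of the parameters is not shown.

```python
import math


def gaussian_membership(x, c, sigma):
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(inputs, rules):
    """Forward pass through a five-layer ANFIS with one output."""
    # Layers 1 and 2: fuzzify each input, then multiply the memberships
    # to obtain the firing strength of each rule.
    firing = []
    for rule in rules:
        strength = 1.0
        for x, (c, sigma) in zip(inputs, rule["premise"]):
            strength *= gaussian_membership(x, c, sigma)
        firing.append(strength)
    # Layer 3: normalise the firing strengths so that they sum to one.
    total = sum(firing)
    normalised = [w / total for w in firing]
    # Layer 4: weight each rule's linear consequent by its normalised strength.
    weighted = []
    for w_bar, rule in zip(normalised, rules):
        *coeffs, constant = rule["consequent"]
        linear = sum(p * x for p, x in zip(coeffs, inputs)) + constant
        weighted.append(w_bar * linear)
    # Layer 5: sum the rule outputs.
    return sum(weighted)


# Two hypothetical rules acting on two inputs.
rules = [
    {"premise": [(0.0, 1.0), (0.0, 1.0)], "consequent": [0.0, 0.0, 10.0]},
    {"premise": [(1.0, 1.0), (1.0, 1.0)], "consequent": [0.0, 0.0, 20.0]},
]
print(anfis_forward([0.5, 0.5], rules))
```

Because the example input lies midway between the two rule centres, both rules fire equally and the output is the average of the two consequents.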

2.5 Fuzzy c-means clustering

Conventional clustering techniques assign data to a cluster without consideration of the extent of its ‘belonging’ to that cluster. First introduced by Dunn (1973), fuzzy c-means clustering (FCM) is a clustering approach that allows a data point to belong to multiple clusters with varying degrees of membership. This method uses an iterative clustering technique to produce an optimal set of cluster centres $d_j$ through the minimisation of an objective function $J_{\mathrm{FCM}}$:

$J_{\mathrm{FCM}} = \sum_{i=1}^{n_p} \sum_{j=1}^{n_c} M_{ij}^{a} \left\| x_i - d_j \right\|^2$ (2)

where $n_p$ and $n_c$ are the number of data points and clusters, respectively; $M_{ij}$ is the membership matrix; $a > 1$ is a parameter controlling the fuzziness of the system; and $\| x_i - d_j \|^2$ is the squared Euclidean distance between observation $x_i$ and cluster centre $d_j$.

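
A minimal sketch of the iterative procedure is given below. It evaluates the objective function of Equation 2 and alternates between the standard membership and centre updates usually associated with FCM; those update formulas are not stated in the text above and are included here as an assumption, as are the data and the initial centres.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def fcm_objective(points, centres, M, a):
    """Objective J_FCM: membership-weighted sum of squared distances."""
    return sum(M[i][j] ** a * sq_dist(points[i], centres[j])
               for i in range(len(points)) for j in range(len(centres)))


def update_memberships(points, centres, a):
    M = []
    for p in points:
        d2 = [sq_dist(p, c) for c in centres]
        if any(d == 0.0 for d in d2):
            # Point coincides with one or more centres: share full membership.
            hits = [d == 0.0 for d in d2]
            M.append([1.0 / sum(hits) if h else 0.0 for h in hits])
            continue
        inv = [(1.0 / d) ** (1.0 / (a - 1.0)) for d in d2]
        total = sum(inv)
        M.append([v / total for v in inv])
    return M


def update_centres(points, M, a, n_c):
    centres = []
    for j in range(n_c):
        w = [M[i][j] ** a for i in range(len(points))]
        total = sum(w)
        centres.append([sum(w[i] * points[i][k] for i in range(len(points))) / total
                        for k in range(len(points[0]))])
    return centres


def fuzzy_c_means(points, initial_centres, a=2.0, n_iter=50):
    centres = [list(c) for c in initial_centres]
    for _ in range(n_iter):
        M = update_memberships(points, centres, a)
        centres = update_centres(points, M, a, len(centres))
    return centres, update_memberships(points, centres, a)


# Hypothetical one-dimensional data forming two groups.
points = [[0.0], [0.2], [0.1], [5.0], [5.1], [4.9]]
start = [[1.0], [4.0]]
centres, M = fuzzy_c_means(points, start)
print(centres)
```

Each row of the returned membership matrix sums to one, so a point lying between two clusters is shared between them rather than being forced into one.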
2.6 Classification and regression trees and random forests

Classification and regression trees (CARTs) are a non-parametric method that builds classification or regression models in the form of a tree structure. At each tree node, a specified number of features are randomly selected and tested to achieve an optimal split of the data. Although decision trees can be highly effective, they are prone to overfitting and are sensitive to the specific data set on which they are trained. A robust solution to overfitting is the concept of random forests (RFs), first proposed by Breiman (2001). An RF is an ensemble learning method that operates by building multiple decision trees and aggregating the results (see Figure 2). Multiple different training sets (termed ‘bootstrap samples’) are generated by sampling with replacement randomly from the original data. This method builds several instances of a decision tree that produces an output ŷ i corresponding to each tree. All individual outputs are then averaged to obtain the final prediction, ŷ.

Figure 2 Illustration of the RF approach showing the structure of individual trees
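
The bootstrap sampling and averaging steps can be sketched as follows. For brevity, each ‘tree’ is reduced to a single-split regression stump on one feature, so the random selection of features at each node is not represented; this simplification, the data and the function names are assumptions made for illustration.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D data by minimising squared error."""
    best_sse, best_split = float("inf"), None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if sse < best_sse:
            best_sse, best_split = sse, (threshold, left_mean, right_mean)
    if best_split is None:  # all x values identical: predict the mean
        constant = mean(ys)
        return lambda x: constant
    threshold, left_mean, right_mean = best_split
    return lambda x: left_mean if x < threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        # Bootstrap sample: draw n observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict_forest(trees, x):
    """Average the individual tree outputs to obtain the final prediction."""
    return mean(tree(x) for tree in trees)


# Hypothetical data with a step change between x = 4 and x = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
trees = fit_forest(xs, ys)
print(predict_forest(trees, 2.0), predict_forest(trees, 7.0))
```

Because every tree sees a different bootstrap sample, the individual split locations vary, and averaging reduces the sensitivity of the prediction to any single training set.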

2.7 Gaussian process regression

A Gaussian process is a collection of random variables of which any finite number follows a joint Gaussian distribution (Williams and Rasmussen, 1996). Gaussian process regression (GPR) provides a method for performing Bayesian inference about functions in a non-parametric way. One of the key aspects of GPR is the use of covariance functions that encode prior assumptions about the functions that one wishes to learn (in this case the measured data). This avoids reliance on algebraic mapping between inputs and outputs. The overall aim of the process is to learn a regression model of the form $y = f(x) + \epsilon$, where $f(x)$ is a latent function representing the underlying structure of the data and $\epsilon \sim N(0, \sigma^2)$ is a Gaussian noise term where $\sigma^2$ is the variance of the noise (the symbol ‘∼’ means ‘distributed according to’). A Gaussian process can be completely described by a mean vector, $\mu(x)$, and covariance function $k(x, x')$ of input pairs $x$ and $x'$ to describe an underlying real process $f(x)$ as follows:

$f(x) \sim \mathcal{GP}\left(\mu(x), k(x, x')\right)$ (3)

where

$\mu(x) = E[f(x)]$ (4)

$k(x, x') = E\{[f(x) - \mu(x)][f(x') - \mu(x')]^{T}\}$ (5)

2.8 Support vector machine/regression

The term ‘support vector regression’ (SVR) denotes the application of support vector machines (SVMs) to regression problems. The ε-insensitive approach first proposed by Vapnik (1995) is one of the most widely adopted SVM/SVR approaches in the literature. SVR uses either linear or non-linear kernels to map the input space into a high-dimensional feature space. The most common kernel adopted for this purpose is the radial basis function (RBF):

$k(x_i, x_j) = \exp\left(-\gamma_{\mathrm{SVM}} \left\| x_i - x_j \right\|^2\right)$ (6)

where γ SVM is a kernel coefficient. A hyperplane is subsequently constructed in the feature space where the quality of fit to the data is computed using an ε-insensitive cost function (L ε (y)) defined as follows (see Figure 3):

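$L_\varepsilon(y) = \begin{cases} 0 & \text{if } |f(x) - y| \le \varepsilon \\ |f(x) - y| - \varepsilon & \text{otherwise} \end{cases}$ (7)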

where x is the input data with target values y; f(x) is the regression function; and ε is a user-defined positive value representing the maximum distance between f(x) and y for which there is no loss in the cost function. According to Equation 7, only predictions that have residuals greater than ε are penalised, while predictions with smaller residuals have no effect on the regression equation. Considering a linear function as an example, f( x ) can be defined as follows:

$f(x) = (w \cdot x) + b$ (8)

where $w$ is an adjustable weight vector and $b$ is the bias. The objective is to obtain a function that has the smallest ε deviation from the target values in the training data and is also as ‘flat’ as possible (by minimising the squared Euclidean norm $\|w\|^2$).

Figure 3 Illustration of ε-insensitive SVR with slack variables
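
The two building blocks described in this section, the RBF kernel and the ε-insensitive cost function, can be written in a few lines. The sketch below shows only these two functions and not the full SVR optimisation; the numerical values are assumed for illustration.

```python
import math


def rbf_kernel(x_i, x_j, gamma_svm):
    """Radial basis function kernel between two input vectors."""
    sq_norm = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma_svm * sq_norm)


def epsilon_insensitive_loss(y, f_x, epsilon):
    """Zero inside the epsilon tube, linear in the residual outside it."""
    return max(0.0, abs(y - f_x) - epsilon)


# Hypothetical target and two predictions with epsilon = 0.1.
print(epsilon_insensitive_loss(1.0, 1.05, 0.1))  # residual inside the tube
print(epsilon_insensitive_loss(1.0, 1.30, 0.1))  # residual outside the tube
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma_svm=0.5))
```

The first prediction incurs no loss because its residual is smaller than ε, whereas the second is penalised only by the amount by which its residual exceeds ε.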

2.9 Extreme learning machine

Extreme learning machine (ELM) is a three-layer neural network – that is, it comprises a single hidden layer (Huang et al., 2004). The novelty of ELM centres around its use of randomly generated hyperparameters for the hidden layer, which are not updated during training, unlike conventional neural networks (Huang et al., 2006). This significantly reduces the computational time associated with the learning process and increases the ability of the network to generalise within the trained parameter space. The ELM training process involves the generation and selection of random numbers for the weight and bias matrices for the hidden layer (Huang et al., 2011). Since the number of neurons in the hidden layer is typically much less than the number of training observations, the network is an overdetermined linear system. A consequence of this is that the output weight matrix is the only parameter that needs to be optimised during training, which can be undertaken using an ordinary least-squares approach.

2.10 Particle swarm optimisation

Particle swarm optimisation (PSO) is an optimisation algorithm developed by Kennedy and Eberhart (1995). This approach attempts to mimic interactions in groups of social beings and the sharing of information between the group members (termed ‘particles’). Rather than using a single particle to search for an optimal solution, the whole population is used, where the velocity of each member is defined by both a stochastic and a deterministic component. While each particle moves randomly, it is partially guided by its own (local) best position, as well as the best position of the group (global). The updated velocity vector at time $t + 1$ for particle $i$ ($v_i^{t+1}$) is defined as follows:

$v_i^{t+1} = v_i^{t} + \beta_g \rho_1 \left(g^* - x_i^{t}\right) + \beta_l \rho_2 \left(x_i^* - x_i^{t}\right)$ (9)

where $\rho_1$ and $\rho_2$ are two randomly initiated vectors with entries ranging between 0 and 1; $\beta_g$ and $\beta_l$ are the global and local learning parameters, respectively; $x_i$ is the position of particle $i$; and $g^*$ and $x_i^*$ are the global and local (for particle $i$) best historical locations, respectively.

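
A minimal sketch of the procedure is given below, using the velocity update of Equation 9. The position update (new position equals old position plus new velocity), the velocity clamping, the clipping of positions to the search bounds and all parameter values are assumptions added to make the example self-contained; they are not specified in the text above.

```python
import random


def pso_minimise(cost, bounds, n_particles=20, n_iter=100,
                 beta_g=1.5, beta_l=1.5, v_max_fraction=0.2, seed=0):
    """Minimise cost over a box; returns best position, best cost and history."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [v_max_fraction * (hi - lo) for lo, hi in bounds]
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    x_best = [p[:] for p in x]                    # local best positions
    x_best_cost = [cost(p) for p in x]
    g_idx = min(range(n_particles), key=lambda i: x_best_cost[i])
    g_best, g_best_cost = x_best[g_idx][:], x_best_cost[g_idx]
    history = [g_best_cost]
    for _ in range(n_iter):
        for i in range(n_particles):
            for k in range(dim):
                rho_1, rho_2 = rng.random(), rng.random()
                v_new = (v[i][k]
                         + beta_g * rho_1 * (g_best[k] - x[i][k])
                         + beta_l * rho_2 * (x_best[i][k] - x[i][k]))
                # Assumed safeguard: clamp velocity and keep position in bounds.
                v[i][k] = max(-v_max[k], min(v_max[k], v_new))
                lo, hi = bounds[k]
                x[i][k] = max(lo, min(hi, x[i][k] + v[i][k]))
            c = cost(x[i])
            if c < x_best_cost[i]:
                x_best[i], x_best_cost[i] = x[i][:], c
                if c < g_best_cost:
                    g_best, g_best_cost = x[i][:], c
        history.append(g_best_cost)
    return g_best, g_best_cost, history


def sphere(p):
    """Hypothetical test function with its minimum at the origin."""
    return sum(value ** 2 for value in p)


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
g_best, g_best_cost, history = pso_minimise(sphere, bounds)
print(g_best, g_best_cost)
```

The global best cost stored in `history` can only decrease or stay constant from one iteration to the next, which provides a simple check on the implementation.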
2.11 Evolutionary algorithms

First proposed by Holland (1992), genetic algorithms (GAs) are arguably the most popular variant of evolutionary algorithm. These methods are a computational model inspired by evolution and the mechanisms of natural selection and are typically deployed as search and optimisation algorithms. The parameters of the user-defined search space are first encoded in the form of chromosomes, which can in turn be grouped to form a population. The process begins by initiating a random population representing different nodes in the search space. The fitness (cost) function is then evaluated for each node to determine the fitness value. New search nodes are randomly generated by applying genetic operations on the nodes based on their fitness values. This process is repeated until an optimal solution is acquired. The purpose of the genetic operators is to combine the ‘good’ structures of each node to produce an improved search node. Common genetic operators are shown in Figure 4 and include (a) cross-over (portions of chromosomes are swapped), (b) reproduction (chromosomes with good fitness values in an old population are preserved in the new population) and (c) mutation (occasional random alteration of a chromosome).

Figure 4 Overview of genetic operators used for the GA optimisation procedure
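
The three genetic operators can be sketched for binary chromosomes as follows. Tournament selection of parents, single-point cross-over and the parameter values are assumptions made for this illustration, and the fitness function (the number of ones in the chromosome) is a hypothetical stand-in for a real cost function.

```python
import random


def run_ga(fitness, n_genes, pop_size=20, n_generations=50,
           mutation_rate=0.02, n_elite=2, seed=0):
    """Maximise fitness over binary chromosomes of fixed length (n_genes >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(n_generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best_history.append(fitness(ranked[0]))
        # Reproduction: the fittest chromosomes are preserved unchanged.
        new_population = [chromosome[:] for chromosome in ranked[:n_elite]]
        while len(new_population) < pop_size:
            # Assumed selection scheme: binary tournament for each parent.
            parent_a = max(rng.sample(ranked, 2), key=fitness)
            parent_b = max(rng.sample(ranked, 2), key=fitness)
            # Cross-over: swap the chromosome portions after a random cut point.
            cut = rng.randrange(1, n_genes)
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: occasional random flip of a gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            new_population.append(child)
        population = new_population
    return max(population, key=fitness), best_history


# Hypothetical fitness: count of ones in a 16-gene chromosome.
best, best_history = run_ga(sum, n_genes=16)
print(best, sum(best))
```

Because the fittest chromosomes are carried over unchanged, the best fitness recorded in `best_history` never decreases between generations.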

Three alternative evolutionary algorithms are (a) genetic programming (GP), (b) gene expression programming (GEP) and (c) differential evolution (DE). The fundamental difference between these approaches lies primarily in the composition of the individuals within the respective populations. In GAs, individuals are linear chromosome strings of fixed length; in GP, they are non-linear units with varying shapes and sizes; in GEP, they are encoded linear strings of fixed length (similar to GA chromosomes), which are subsequently expressed as non-linear units of varying shapes and sizes; and in DE, they are real vectors rather than binary chromosome strings.

2.12 Imperialist competitive algorithm

The imperialist competitive algorithm (ICA) is an alternative evolutionary search and optimisation algorithm proposed by Atashpaz-Gargari and Lucas (2007) and is derived from human beings’ sociopolitical evolution. In this case, the initial population is termed ‘countries’ and is broken into two categories: (a) colony and (b) imperialist state. A cost function is used to determine which countries of the initial population are the most ‘powerful’ and are therefore selected as imperialist states. The remaining countries are assigned as colonies of the imperialist states depending on the value of the cost function for each imperialist state. The imperialist state and their respective colonies are denoted an empire. The ensuing optimisation process is described by Figure 5.

Figure 5 Overview of ICA implementation

3.1 Overview

A wide range of ML techniques have been developed for tunnelling applications. Research areas have included TBM automation (Mokhtari and Mooney, 2019), tunnel condition assessment (Chen et al., 2019a; Li et al., 2017; Zhu et al., 2020), anomaly detection (e.g. Sheil et al., 2020; Yu et al., 2018), tunnel profile measurement (e.g. Xue and Zhang, 2019), resilience assessment (e.g. Khetwal et al., 2019), structural defect identification (e.g. Ding et al., 2019), tunnel face stability (e.g. Hayashi et al., 2019), rockburst prediction (e.g. Liu and Hou, 2019) and intelligent building information modelling (e.g. Zhao et al., 2019a). This review focuses on four tunnelling applications where the use of ML has been most prevalent: (a) TBM performance prediction, (b) tunnel-induced settlement prediction, (c) geological forecasting and (d) cutterhead design optimisation.

3.2 TBM performance prediction

A large body of research has focused on the development of improved TBM performance predictions by leveraging recent advances in ML. Table 1 presents an overview of these studies where the corresponding parameters and notation are defined in Figure 6 (a slurry pressure balance shield machine is shown for illustrative purposes) and Table 2. Research into TBM performance has been largely confined to open-mode TBM tunnelling in rock with only a handful of efforts with slurry or earth pressure balance (EPB) shield TBMs in softer soils (e.g. Mokhtari and Mooney, 2020; Mokhtari et al., 2020; Mooney et al., 2018). Mooney et al. (2018) note that maintenance of EPB, TBM guidance using thrust and articulation jacks, scraping and imbibing of the in situ ground and muck processing through a depressurising screw conveyor combine to make performance prediction of EPB TBMs particularly challenging.

Figure 6 Definition of tunnel geometry, TBM and soil parameters adopted in this study (note that a slurry pressure balance shield machine is shown for illustrative purposes)

 Table 1 Summary of previous studies exploring the application of ML algorithms to the prediction of TBM performance

| Reference | ML algorithm^a | Features^b | Predictand^c | TBM type^d | Data set and size |
| --- | --- | --- | --- | --- | --- |
| Grima et al. (2000) | ANFIS | Core fracture frequency, UCS, RPM, thrust/cutter, $D_c$ | PR | | 640 – various tunnels worldwide |
| Benardos and Kaliampakos (2004) | ANN (8-9-4-1) | Rock mass fracture degree, RMW, SF, RMQ, UCS, H, WT, k | AR | O | 11 – Athens metro tunnel (Greece) |
| Simoes and Kim (2006) | FL (rule- and parametric-based) | D, RQD, RMR, water inflow rate | Utilisation | O | Milyang tunnel (South Korea); Queens water tunnel (USA); Manapouri tunnel (New Zealand) |
| Mohammadi et al. (2007) | RBF-ANN (8-?-1) | RQD, UCS, SF, WT, RMW, RMR, H, k | AR | O | 11 – Athens metro tunnel (Greece) |
| Zhao et al. (2007) | Ensemble neural network | UCS, DPW, α, BI | SBI | EPB | 47 – the Deep Tunnel Sewerage System (Singapore) |
| Acaroglu et al. (2008) | FL | UCS, BTS, $D_c$, $t_c$, $s_c$, penetration | SER | | Linear cutting tests |
| Benardos (2008) | ANN (8-9-5-1); ANN (8-9-4-1) | RQD, RMW, SF, RMR, UCS, H, WT, k | PR | O | 330 – Maen tunnel; 301 – Pieve tunnel (Italy); 11 – Athens metro tunnel (Greece) |
| Mikaeil et al. (2009) | Multifactorial fuzzy evaluation | UCS, BTS, PSI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Yagiz et al. (2009) | ANN (4-8-1); non-linear multivariate regression | UCS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Gholamnejad and Tayarani (2010) | ANN (3-9-7-3-1) | UCS, RQD, DPW | PR | O | 185 – Queens water tunnel (USA), Karaj–Tehran water tunnel (Iran), Gilgel Gibe II tunnel (Ethiopia) |
| Yagiz and Karahan (2011) | PSO | UCS, BI, DPW, α, TBM field data | PR | O | 151 – Queens water tunnel (USA) |
| Maher (2013) | Linear regression; polynomial regression (degree = 3); SVR (linear, polynomial kernels) | 12 TBM parameters | PR | EPB | Seattle subway tunnel (USA) |
| Oraee et al. (2012) | ANFIS | RQD, UCS, DPW | PR | O | 177 – Queens water tunnel (USA), Gilgel Gibe II tunnel (Ethiopia) |
| Ge et al. (2013) | Least-squares SVM | UCS, BTS, PSI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Ling et al. (2013) | Partial least-squares FNN | UCS, BTS, PSI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Martins and Miranda (2013) | ANN (4-2-1); SVR (RBF kernel) | UCS, PSI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Mobarra et al. (2013) | ANN (4-13-4-1) | UCS, PLS, RPM, normal force designation | PR | O | 289 – Golab water tunnel (Iran) |
| Salimi and Esmaeili (2013) | Linear regression; non-linear multiple regression; ANN (5-17-10-1) | UCS, BTS, PSI, DPW, α | PR | O | 46 – Karaj–Tehran water tunnel (Iran) |
| Shao et al. (2013) | Online prediction with ELM incremental learning | UCS, BTS, PSI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Špačková and Straub (2013) | Dynamic Bayesian networks | Ground zone, rock class, H, ground class, human factor, project geometry, CM, failure mode, number of failures | Time | | Numerical modelling of Suncheon–Dolsan tunnel (South Korea) |
| Ghasemi et al. (2014) | FL | UCS, BTS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Mahdevari et al. (2014) | SVR (RBF kernel) | JF, α, UCS, BTS, CP, DPW, SE, BI | PR | O | 151 – Queens water tunnel (USA) |
| Salimi et al. (2015) | ANN (2-4-1); ANFIS; SVR (RBF kernel) | UCS, DPW | PR | O | 75 – Zagros water conveyance tunnel (Iran) |
| Tao et al. (2015) | RF | UCS, BTS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Yagiz and Karahan (2015) | DE; hybrid harmony search; grey wolf optimiser | UCS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Fattahi (2016) | FCM-ANFIS | UCS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Salimi et al. (2016) | ANFIS; SVR (RBF kernel) | UCS, DPW | PR | O | 75 – Zagros water conveyance tunnel (Iran) |
| Adoko et al. (2017) | Bayesian inference | UCS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Armaghani et al. (2017) | ANN (7-11-1); PSO-ANN (7-11-1); ICA-ANN (7-11-1) | UCS, BI, RQD, RMR, RMW, JF, RPM | PR | O | 1286 – PSRWT^e tunnel (Malaysia) |
| Fattahi and Babanouri (2017) | DE-SVM; artificial bee colony SVM; gravitational search SVM | UCS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Minh et al. (2017) | FL | UCS, BTS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Mooney et al. (2018) | SVR (RBF kernel; RReliefF feature selection) | JF, CP, $Q_F$, H, $H_w$ | AR | EPB | Seattle University link tunnel (USA) |
| Armaghani et al. (2018) | GEP | UCS, BTS, RQD, RMR, RMW, JF, RPM | PR | O | 1286 – PSRWT tunnel (Malaysia) |
| Mikaeil et al. (2018) | Multifactorial fuzzy evaluation approach | UCS, PSI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Adoko and Yagiz (2019) | FCM clustering; subtractive clustering; ANFIS; knowledge-based fuzzy inference | Rock type, UCS, BI, α, DPW, JF | FPI | O | 151 – Queens water tunnel (USA) |
| Armaghani et al. (2019) | PSO-ANN (8-12-1); ICA-ANN (8-12-1) | UCS, BTS, RMR, RQD, q, RMW, JF, RPM | AR | O | 1286 – PSRWT tunnel (Malaysia) |
| Cachim and Bezuijen (2019) | Time series ANN | Foam injection ratio, lagged values of torque | T | EPB | Botlek rail tunnel (the Netherlands) |
| Gao et al. (2019) | RNN; long short-term memory networks; gated recurrent networks | 44 TBM parameters | T, velocity, JF, $P_c$ | EPB | Shenzhen subway tunnel (China) |
| Koopialipoor et al. (2019a) | Group method of data handling | UCS, BTS, RQD, RMR, RMW, JF, RPM | PR | O | 1286 – PSRWT tunnel (Malaysia) |
| Koopialipoor et al. (2019b) | ANN (5-8-32-8-1) | UCS, BTS, RQD, RMR, RMW | PR | O | 1286 – PSRWT tunnel (Malaysia) |
| Naghadehi et al. (2019) | ICA-GEP | UCS, BTS, BI, DPW, α | PR | O | 151 – Queens water tunnel (USA) |
| Salimi et al. (2019) | CART; GP | UCS, RQD, DPW, joint condition | FPI | | Various tunnels worldwide |
| Shi et al. (2019a) | FCM clustering; attribute correlation guided FCM clustering | 53 TBM parameters | PR | EPB | Tunnel in China |
| Song et al. (2019) | Time series segmentation guided by FCM clustering | 53 TBM parameters | PR | EPB | Tunnel in China |
| Xu et al. (2019) | kNN; SVR (RBF kernels); ANN (6-?-1); CART; chi-squared automatic | UCS, BTS, RQD, RMW, JF, RPM | PR | O | 1286 – PSRWT tunnel (Malaysia) |
| Zhou et al. (2020) | ANN (6-2-1); GP | UCS, RQD, RMR, BTS, JF, RPM | AR | O | 1286 – PSRWT tunnel (Malaysia) |
| Koopialipoor et al. (2020) | Hybrid firefly-ANN (7-8-1) | UCS, BTS, RQD, RMR, RMW, JF, RPM | PR | O | 1286 – PSRWT tunnel (Malaysia) |
| Mokhtari et al. (2020) | Elastic net regression | RPM, JF, $Q_F$, CP, SCP, H | AR | EPB | Seattle Northlink Extension tunnel (USA) |
| Mokhtari and Mooney (2020) | SVR (RBF kernel; RReliefF feature selection) | RPM, JF, $Q_F$, CP, SCP | AR | EPB | Seattle Northlink Extension tunnel (USA) |

^a FNN, fuzzy neural networks; kNN, k-nearest neighbours

^b CM, construction method; CP, cutterhead power; H, cover from surface to TBM; JF, cutterhead jacking force; $Q_F$, conditioning foam flow rate; RPM, cutterhead rotation speed; SCP, screw conveyor power; SE, specific energy; SF, stability factor; $H_w$, height of the groundwater table above TBM; WT, elevation of the groundwater table

^c AR, advance rate; FPI, field penetration index; PR, penetration rate; SBI, specific rock mass boreability index; SER, specific energy requirement

^d EPB, earth pressure balance; O, open-mode hard rock

^e PSRWT, Pahang–Selangor Raw Water Transfer

 Table 2 Definitions of rock parameters used in Table 1

| Parameter | Definition |
| --- | --- |
| UCS | Unconfined compressive strength |
| DPW | Distance between planes of weakness |
| α | Joint orientation |
| BTS | Brazilian tensile strength |
| BI | Brittleness index |
| RQD | Rock quality designation |
| RMW | Rock mass weathering |
| RMR | Rock mass rating |
| PSI | Point strength index |
| RMQ | Rock mass quality |
| q | Quartz content |

From Table 1, it is notable that penetration rate (PR) is the most favoured measure of TBM performance, defined as the penetration along the axis of the tunnel per unit tunnelling time (i.e. downtimes/stoppages are not included in the calculation). It can be observed that the input parameters are dominated by ground (rock) properties, with unconfined compressive strength (UCS) being the most common. For example, the study by Benardos and Kaliampakos (2004) was one of the earliest to use rock mass properties (e.g. UCS, rock mass rating, weathering) as inputs to an ANN for TBM performance prediction, obtaining an error of 6–8%. For softer soils, Mooney et al. (2018) noted that TBM performance was most influenced by cutterhead torque, foam flow rate and screw conveyor rotation speed. It is noteworthy that the selection of input parameters has been predominantly guided by empiricism from previous literature and that the application of more robust ‘feature engineering’ techniques in this area has been limited. Using principal component analysis, Salimi et al. (2015, 2016, 2019) confirmed the strong dependence of TBM performance on rock mass parameters (e.g. UCS, rock quality designation, joint spacing and condition) for hard-rock tunnelling.

Another interesting observation is the infrequent use of TBM operational parameters (e.g. jacking force (JF), cutterhead torque (T), cutterhead rotation speed (RPM), slurry parameters) and geometric parameters (e.g. tunnel diameter, distance from reception shaft, soil cover) as features. This is because many of the training data sets relate to a single construction project, and it is commonly assumed that TBM and geometric parameters remain constant during a given project and so should not be included in the ML model. While this provides good predictability on a case-by-case basis (where one might wish to forecast the performance of the TBM for the current project based on the data gathered thus far), it limits the applicability of these trained ML models to other projects. This is particularly important in the case of ML models, as they typically demonstrate a poor ability to extrapolate beyond their calibration space (Ahmed et al., 2010). Recent studies incorporating the influence of TBM operational parameters for performance prediction have demonstrated an improved ability to generalise, for example, to alternative excavation techniques (Song et al., 2019).

The most common ML technique adopted for the prediction of TBM performance is a multilayer feedforward ANN with back-propagation. The main difference between the ANN models adopted in the literature is the optimal ANN architecture that was ultimately selected. Even though similar input parameters and data sets have been employed across various studies, the range of architectures that have been adopted is quite wide. For example, Armaghani et al. (2017) and Koopialipoor et al. (2019b) adopted 7-11-1 (n f-n h1-n o) and 5-8-32-8-1 architectures, respectively, for the prediction of the same data set (the Pahang–Selangor raw water transfer tunnel (PSRWT)). It is noteworthy that the use of several hidden layers and neurons increases the likelihood of encountering overfitting. Hecht-Nielsen (1987) proved that any continuous function can be represented by a neural network using a single layer with n h1 = 2n f + 1 nodes, albeit using significantly more complex activation functions than the conventional sigmoidal functions commonly adopted in the literature. This corresponds to architectures of 7-15-1 and 5-11-1, respectively, for these studies.
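The arithmetic behind the architectures quoted above can be checked with a few lines of code. The sketch below applies the n h1 = 2n f + 1 rule and, as an added illustration, counts trainable parameters under the assumption of fully connected layers with one bias per neuron (the reviewed studies may have used different conventions). All function names are hypothetical.

```python
def hecht_nielsen_hidden_nodes(n_features):
    """Hidden-layer size from the 2 * n_f + 1 rule quoted in the text."""
    return 2 * n_features + 1


def single_layer_architecture(n_features, n_outputs=1):
    """Format a single-hidden-layer architecture as 'n_f-n_h1-n_o'."""
    return f"{n_features}-{hecht_nielsen_hidden_nodes(n_features)}-{n_outputs}"


def count_parameters(layers):
    """Weights plus biases, assuming fully connected layers with biases."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))


for n_f in (7, 5):
    print(single_layer_architecture(n_f))

# Parameter counts for the two PSRWT architectures cited above.
print(count_parameters([7, 11, 1]))
print(count_parameters([5, 8, 32, 8, 1]))
```

Under these assumptions the deeper network carries roughly six times as many trainable parameters as the single-hidden-layer one, which is one way of seeing why additional layers and neurons raise the risk of overfitting on small data sets.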

Other popular ML methods adopted in the literature include FL, due to its ability to incorporate empirical evidence/experience and, recently, more flexible and non-linear ML algorithms such as CARTs (e.g. Xu et al., 2019) and RFs (e.g. Tao et al., 2015). To develop improved methods for the determination of the optimum architecture and the avoidance of local minima, hybrid methods have also been explored by fusing ML models with optimisation algorithms such as ICA (e.g. Naghadehi et al., 2019), PSO (e.g. Armaghani et al., 2018), DE (e.g. Fattahi and Babanouri, 2017) and FCM (e.g. Fattahi, 2016).

3.3 Tunnelling-induced settlement prediction

Table 3 presents an overview of ML models adopted for the prediction of tunnelling-induced soil settlements, s, as well as tunnel convergence, C. Given the complex nature of tunnelling-induced settlements, the number of features used in these models is notably greater. Furthermore, these features comprise a mix of soil, tunnel geometry and TBM operational parameters. For the studies considered in this review, ANNs appear to have been the ML model of choice pre-2012, although they continue to appear in more recent literature. It is again apparent that a wide range of architectures have been explored from the 47-47-47-47-2 architecture adopted by Kim et al. (2001) to the more compact 3-4-1 architecture proposed by Hasanipanah et al. (2016) and Moghaddasi and Noorian-Bidgoli (2018).

 Table 3 Summary of previous studies exploring the application of ML algorithms to the prediction of tunnel-induced settlements

| Reference | ML algorithm (a) | Salient features (b) | Predictand (c) | Data set and size (d) |
| --- | --- | --- | --- | --- |
| Shi et al. (1998) | ANN (8-24-1) | L, H, A, delay in closing inverted arch, WT, AR, CM, SPT values | S max | 356 – Brasilia tunnel (Brazil) |
| Kim et al. (2001) | ANN (47-47-47-47-2) | 47 tunnel, TBM and soil parameters | S max, i | 113 – Seoul subway tunnel (South Korea) |
| Neaupane and Adhikari (2006) | ANN (6-3-9-1) | H, D, s u, V L, CM, WT | S max | 26 – various projects worldwide |
| Neaupane and Adhikari (2006) | ANN (6-3-5-1) | H, D, s u, V L, CM, WT, S max | U max | 26 – various projects worldwide |
| Suwansawat and Einstein (2006) | ANN (10-20-1) | H, L, geology at crown and invert, WT, P f, PR, θ, P g, V g | S max | 49 – Bangkok MRT tunnel (Thailand) |
| Yoo and Kim (2007) | ANN (8-4-4) | H, WT, support pattern, geologies, soil layer thicknesses | S max, C, lining stresses | 95 – high-speed railway tunnel (South Korea) |
| Santos and Celestino (2008) | ANN (14-12-6-1) | 14 parameters: tunnel geometry and ground conditions | S max | 81 – São Paulo subway tunnel (Brazil) |
| Boubou et al. (2010) | ANN (11-7-7-1) | AR, T, P f, P g, V g, JF, time, steering deviations, total work, X/H | S(X) | 432 – Toulouse subway tunnel (France) |
| Franza et al. (2018) | ANN (5-4-2) | H, R d, V L, X, H | S(X,Z), U(X,Z) | Centrifuge test data |
| Goh and Hefney (2010) | ANN (8-5-1) | H, AR, P f, SPT at crown and springline, wc, E s, P g | S max | 148 – MRT tunnel (Singapore) |
| Kongsomboon et al. (2010) | ANN (14-15-15-1) | H, D, L, Y, geology, WT, P f, PR, θ, P g, V g | U max | 38 – Chaloem Ratchamongkol MRT and Bangkok water conveyance tunnels (Thailand) |
| Qiao et al. (2010) | ANN (13-20-1) | H, L, geologies, WT, P f, PT, θ, P g, V g | S max | 49 – Bangkok MRT tunnel (Thailand) |
| Tsekouras et al. (2010) | ANN (5-11-3) | Stability, lining placement, t l, E l, Y | S max, U max, V max | 7650 – finite-difference analyses |
| Ninic et al. (2011) | ANN (6-14-1) | P f, P g, κ, H, X, Y | S(X,Y) | 2160 – finite-element analyses |
| Darabi et al. (2012) | ANN (5-17-1) | c′, ϕ′, E s, H, D | S max | 50 – various projects in Iran and Turkey |
| Li et al. (2012) | PSO-SVM with chaotic mapping |  | C h | 39 – Xiakeng tunnel (China) |
| Mahdevari and Torabi (2012) | ANN (9-35-28-1); RBF-ANN (9-35-28-1) | H, GSI, RQD, compressive and tensile strength, c′, ϕ′, E s, UCS | C ave | 60 – Ghomroud water tunnel (Iran) |
| Mahdevari et al. (2012) | SVM (RBF kernels); ANN (9-35-28-1) | H, GSI, RQD, compressive and tensile strength, c′, ϕ′, E s, UCS | C ave | 60 – Ghomroud water tunnel (Iran) |
| Marto et al. (2012) | ANN (9-24-1) | H, SPT, wc, c′, ϕ′, E s, γ, ν s, Y | S(Y) | 160 – Karaj urban railway tunnel (Iran) |
| Pourtaghi and Lotfollahi-Yaghin (2012) | Wavelet-ANN (13-10-1) | H, L, geologies, WT, P f, PR, P g, θ, V g | S max | 49 – Bangkok MRT tunnel (Thailand) |
| Rafiai and Moosavi (2012) | ANN (11-7-4-2) | R, rock stresses, c′, ϕ′, E s, ν s, ψ′, t l, E l, ν l | C h, C v | 2500 – finite-difference analyses |
| Adoko et al. (2013) | MARS; ANN (8-20-26-1) | Rock class rating index, c′, ϕ′, E s, γ, H, Y, time | C ave | 390 – CKTJ-9 high-speed railway tunnel (China) |
| Khatami et al. (2013) | ANN (6-15-1) | Building EI, width, weight. Distance between tunnels, H, X | Building settlement | 160 – finite-element analyses |
| Mahdevari et al. (2013) | SVM (RBF kernels) | wc, γ, c′, ϕ′, E s, k sub | C ave | 75 – Amirkabir tunnel (Iran) |
| Ninić et al. (2013) | PSO-ANN (6-20-1) | E s, K, P f, Y, X, P g | S(X,Y) | 625 – finite-element analyses |
| Ocak and Seker (2013) | ANN (18-9-1); SVM (RBF kernels); GPR (SQ kernels) | 18 tunnel, TBM and soil parameters | S | 230 – Istanbul metro tunnel (Turkey) |
| Bouayad and Emeriault (2014) | ANFIS combined with PCA | 6 TBM and soil parameters | S | 432 – Toulouse subway tunnel (France) |
| Guo et al. (2014) | Elman-type PSO-RNN (4-20-1) | H, JF, P f, V g | S | Jiangji subway tunnel (China) |
| Ahangari et al. (2015) | ANFIS; GEP | H, D, c′, ϕ′, E s | S max | 53 – finite-difference analyses |
| Behnia and Shahriar (2015) | GEP | H, D, c′, ϕ′, E s | S max | 50 – finite-difference analyses |
| Bouayad et al. (2015) | Partial least-squares regression combined with agglomerative hierarchical clustering | 11 TBM and tunnel geometry parameters | S | 432 – Toulouse subway tunnel (France) |
| Khamesi et al. (2015) | Fuzzy systems coupled with (a) PSO, (b) ICA and (c) nearest-neighbourhood clustering | K, E s, s u, soil mass number | S | 240 – Karaj subway tunnel (Iran) |
| Koukoutas and Sofianos (2015) | ANN (14-15-1) | H, WT, geologies, JF, P f, PR, T, P g, V g, excavated material | S max | 584 – Athens extension (317), Thessaloniki metro tunnels (267; Greece) |
| Mohammadi et al. (2015) | ANN (6-14-1) | H, soil type, γ, c′, ϕ′, E s | S max | 17 – Niayesh highway tunnel (Iran) |
| Dindarloo and Siami-Irdemoosa (2015) | CART | H, D, V L, normalised V L, s u, WT, CM | S max | 34 – tunnels from the UK, the USA, Canada, Thailand, Brazil and Germany |
| Cao et al. (2016) | RNN combined with Gappy POD | E s, P g | S | 60 – finite-element analyses |
| Hasanipanah et al. (2016) | PSO-ANN (3-4-1) | K, s u, E s | S max | 143 – Karaj subway line 2 tunnel (Iran) |
| Lai et al. (2016) | ANN (9-?-2) | H, D, c′, ϕ′, E s, P g, V g, JF, PR | S max, trough width coefficient | 6 – three tunnel projects in China |
| Wang et al. (2016) | Relevance vector machine (RBF kernels) | H, Y, P f, P g, AR, geologies, lagged settlement measurements | S | 182 – Wuhan metro line 2 tunnel (China) |
| Zhou et al. (2016) | RF (500 trees) | H, D, c′, ϕ′, E s, P g, V g, JF, PR | S max | 26 – Shanghai, Guangzhou and Nanjing tunnels (China) |
| Bouayad and Emeriault (2017) | ANFIS coupled with PCA and agglomerative hierarchical clustering | 6 TBM and soil parameters | S | 432 – Toulouse subway tunnel (France) |
| Kohestani et al. (2017) | RF (270 trees) | H, L, geologies, P f, PR, θ, P g, V g | S max | 49 – Bangkok MRT tunnel (Thailand) |
| Naeini and Khalili (2017) | ANFIS | H, D, c′, ϕ′, E s | S max | 46 – subway tunnels in Iran and Turkey |
| Zhang et al. (2017) | Wavelet least-squares GA-SVM (RBF kernels) | Measured settlement time histories | S | 60 – Wuhan metro line 3 tunnel (China) |
| Zhou et al. (2017) | RF (500 trees) | Set A: H, D, V L, normalised V L, s u, WT, CM; Set B: H, D, c′, ϕ′, E s, P g, V g, JF, AR | S max, trough width coefficient | 66 – various tunnels worldwide |
| Fattahi and Babanouri (2018) | Rock engineering systems | H, L, WT, P f, PR, θ, P g, V g, geologies | S max | 49 – Bangkok MRT tunnel (Thailand) |
| Goh et al. (2018) | MARS | H, AR, P f, SPTs, wc, E s, P g | S max | 148 – three MRT tunnels in Singapore |
| Mehrnahad and Zekrabad (2018) | ANN (7-24-1) | L, H, γ, E s, c′, ϕ′, P f | S max | 181 – Mashhad metro line 2 tunnel (Iran) |
| Moeinossadat et al. (2018a) | ANFIS | H, D, H/D, c′, ϕ′, E s, P g, V g, JF, PR | S max | 41 – Shanghai subway line 2 tunnel (China) |
| Moeinossadat et al. (2018b) | ANFIS; GEP; neuro-genetic systems | H, D, c′, ϕ′, E s, P g, V g, JF, AR | S max | 41 – Shanghai subway line 2 tunnel (China) |
| Moghaddasi and Noorian-Bidgoli (2018) | ICA-ANN (3-4-1) | K, s u, E s | S max | 143 – Karaj subway line 2 tunnel (Iran) |
| Sun et al. (2018a) | Multiclass SVM (RBF kernels) | D, H, support stiffness, rock tunnelling quality index | C ave | 117 – various tunnels worldwide |
| Chen et al. (2019a) | ANN; RBF-ANN; general regression network | JF, T, P f, PR, V g, H, WT, modified SPT, modified DPT, modified UCS | S max | 200 – Changsha metro line 4 tunnel (China) |
| Chen et al. (2019b) | ANN; wavelet-ANN; general regression ANN; ELM; SVM; RF | JF, T, P f, PR, V g, H, WT, modified SPT, modified DPT, modified UCS | S max | 200 – Changsha metro line 4 tunnel (China) |
| Fattahi and Bayatzadehfard (2019) | ANFIS with subtractive clustering; FCM-ANFIS; ANFIS with biogeography-based optimisation | H, L, WT, P f, PR, θ, P g, V g | S max | 49 – Bangkok MRT tunnel (Thailand) |
| Hajihassani et al. (2019) | GEP | H, c′, ϕ′, γ, ν s, E s | C ave | 118 – Karaj urban railway line 2 tunnel (Iran) |
| Hu et al. (2019) | PSO-ANN; PSO-SVR; PSO-ELM | Measured settlement time histories | S | 70 – Zhuhai tunnel (China) |
| Liu and Liu (2019) | GA-GPR (SE + RQ kernels); GA-SVM (RBF kernels) | c′, ϕ′, ν s, E s, K, Y | C h, C v | Finite-difference analyses |
| Moeinossadat and Ahangari (2019) | GEP | c′, ϕ′, γ, ν s, E s, K, H, P f, surface surcharge | S max | 100 – finite-difference analyses |
| Ramezanshirazi et al. (2019) | ANN (15-30-1) | 15 geometric, TBM and soil parameters | S max | Milan M5 metro tunnel |
| Saadallah et al. (2019) | Vector autoregressive with exogenous variables |  | S | 160 – finite-element analyses |
| Shi et al. (2019b) | SVM with information granulation using two-layer perceptron kernel |  | C ave | Panlongshan tunnel (China) |
| Zhang et al. (2019a) | RF (91 trees) | JF, T, P f, PR, V g, H, WT, modified SPT, modified DPT, modified UCS, ground condition, stoppages | S max | 294 – Changsha metro line 4 tunnel set A (China) |
| Zhang et al. (2019a) | RF (38–153 trees) | JF, T, P f, PR, V g, H, WT, modified SPT, modified DPT, modified UCS, ground condition | S max | 265 – Changsha metro line 4 tunnel set B (China) |
| Zhang et al. (2019b) | SVM | H, H/L, geologies | S max | 500 – Huquan–Yangjiawan section of Wuhan metro tunnel (China) |
| Zhu et al. (2019) | Bayesian networks | Seasonal parameters | S | 2762 – Shanghai metro line 1 (China) |
| Hajihassani et al. (2020) | PSO-ANN (8-12-1) | H, AR, SPT, c′, ϕ′, γ, ν s, E s | S max, i t, i l | 123 – Karaj urban railway line 2 tunnel (Iran) |
| Yan et al. (2020) | ANN-SVR-ELM ensemble algorithm | Measured settlement time histories | S max | 70 – Zhuhai tunnel (China) |
| Zhang (2020) | MARS | H, AR, P f, SPTs, wc, E s, P g | S max | 148 – three MRT tunnels in Singapore |
| Zhang et al. (2020) | ANN-SVR-MARS ensemble algorithm using XGBoost | H, AR, P f, SPTs, wc, E s, P g | S max | 148 – three MRT tunnels in Singapore |

(a) MARS, multivariate adaptive regressive splines; PCA, principal component analysis; POD, proper orthogonal decomposition

(b) SPT, standard penetration test; wc, soil water content; GSI, geological strength index; k sub, modulus of subgrade reaction; κ, slope of soil unload–reload curve

(c) i t, i l, transverse and longitudinal inflection points, respectively; u, horizontal movement

(d) MRT, mass rapid transit

DPT, dynamic penetration test; RQ, rational quadratic; SE, squared exponential

While the integration of fuzzy systems has also been used to predict tunnel-induced settlements, the use of SVMs became popular post-2012, quickly followed by more complex and non-parametric methods such as CARTs and RFs. The prominence of these methods for settlement prediction is perhaps explained by the increased complexity of the input–output mapping process for tunnelling-induced settlements. The data sets used for predicting tunnel-induced settlements are also largely based on a single project rather than multiple projects, with the size of the data set varying considerably (from 6 to 7650 data points).

3.4 Geological forecasting

Efforts to predict ahead of the TBM involve identification of geological conditions, as well as the size and location of potential obstacles (Schaeffer and Mooney, 2016). In these cases, it is desirable to identify changes in soil conditions as shown in Figure 7. To obtain actionable information during tunnelling, soil conditions must be forecast sufficiently far in advance of the TBM (typically metres to tens of metres). This is complicated by the fact that the accuracy of forecasting techniques deteriorates as the forecast horizon increases.

Figure 7 Definition of geological identification and forecasting problem

One approach is to consider the TBM itself as an exploratory tool. A popular implementation of this approach first uses statistical interpolation techniques (such as kriging) to develop an initial estimate of the ground conditions at the TBM face from available borehole information, as shown in Figure 8 (Gangrade and Mooney, 2019; Grasmick et al., 2020). These predictions are subsequently updated using TBM driving data to obtain a more reliable estimate of the ground immediately ahead of the TBM. This methodology was adopted by Yamamoto et al. (2003) and Sun et al. (2018b). In particular, Sun et al. (2018b) achieved a prediction accuracy of R² = 0.8 using RFs.

Figure 8 Geological interpolation by kriging documented by Sun et al. (2018b)

Alternatively, ML can also be used to provide a direct mapping between TBM performance parameters and ground conditions. This approach can be considered the inverse of the techniques reviewed for TBM performance prediction. Liu et al. (2019) used SVR combined with a stacked single-target technique to identify multiple targets from a common data set, such as UCS, brittleness index (BI), distance between planes of weakness (DPW) and α. This allowed correlation between targets to be incorporated into the prediction model. The driving data used to identify the target variables included RPM, PR, JF, T and cutterhead power, and a prediction accuracy of R² between 0.63 and 0.83 was achieved. It is notable that R² = 0.83 corresponded to the UCS prediction, indicating its strong correlation with TBM performance in rock. Zhang et al. (2019c) used SVM, RF and k-nearest neighbours to map RPM, T, JF and advance rate (AR) to rock mass type. Zhao et al. (2019b) compared the performance of eight ML models to predict geological type using feature augmentation to improve performance; a traditional ANN was found to provide the best performance. Jung et al. (2019) also used an ANN to predict the ground type from PR, JF and T with an accuracy of R² > 0.9. The PR parameter was found to be the most influential for predicting ground type, particularly across different sites. Liu et al. (2020) used a hybrid algorithm combining traditional ANNs with simulated annealing to predict rock parameters UCS, BI, DPW and α from RPM, T, JF and PR (R² between 0.66 and 0.85). Erharter et al. (2019a, 2019b) used ensemble LSTM networks to classify TBM data into rock behaviour types according to four geological ‘indicators’. Yu and Mooney (2020) employed multinomial logistic regression to characterise the fractional representation of four soil types (sand, clay, silt, till deposits) encountered by an EPB TBM. The regression model was trained using RPM, AR, chamber pressure, excavated soil mass, thrust force and 83 boring logs along the alignment.
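Several of the studies above report accuracy as a coefficient of determination (R²). As a reminder of what that figure measures, the following minimal sketch assumes the common definition 1 − SS_res/SS_tot; individual studies may instead report the squared correlation coefficient, so this is not necessarily the exact metric used in each paper. The UCS values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical UCS values (MPa): measured versus predicted from TBM driving data.
ucs_measured = [80.0, 120.0, 95.0, 150.0, 60.0]
ucs_predicted = [85.0, 110.0, 100.0, 140.0, 70.0]
print(f"R^2 = {r_squared(ucs_measured, ucs_predicted):.2f}")
```

A value of 1 indicates a perfect fit, while a model that always predicts the mean of the observations scores 0.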

Instead of using TBM operational parameters, Zhuang et al. (2019) used convergence displacements in rock to infer E s and ν s through inverse analysis. This involved the use of SVR optimised using a multi-strategy artificial fish swarm algorithm (Mafsa). The Mafsa approach is an ensemble algorithm that builds on the artificial fish swarm algorithm by combining DE, PSO, an adaptive step size and a phased vision strategy to enhance the global search capability and improve convergence speed and optimisation accuracy.

While numerous geophysical methods have been explored for forecasting geological conditions ahead of the TBM face (e.g. electromagnetic methods, electrical methods, seismic reflection methods, infrared detection methods), very few studies have explored the integration of ML algorithms to improve geophysical predictions. Both Alimoradi et al. (2008) and Von and Ismail (2017) used an ANN to identify rock characteristics using ground parameters obtained from tunnel seismic prediction technology. Although Von and Ismail (2017) reported a prediction accuracy of R² = 0.85, they noted that the small data sets at the beginning of a project lead to less reliable predictions.

Wei et al. (2018) documented one of the most comprehensive applications of ML to a new ‘Tunnel Look-ahead Imaging Prediction System’ (Tulips). The Tulips imaging approach comprises three sets of GPR antennae (low frequency for long-range inspection and two high frequencies to identify small objects) and seismic imaging. The pipeline of their event detection and tracking method is outlined in Figure 9. An experimental campaign showed that buried obstacles can be successfully identified and tracked using this methodology. Those authors also recommended the development and application of more robust ML models to larger data sets including expert interpretations and ground prediction and TBM and geological exploration data.

Figure 9 Pipeline of the event detection and tracking approach proposed by Wei et al. (2018)

The final research area covered by this literature review is the optimisation of the cutterhead design (see Figure 10), which appears to have focused exclusively on tunnelling in rock. The literature in this area can be further categorised as an optimisation of the (a) cutter disc layout and (b) cutter disc geometry. For the cutter layout, the optimisation process has been typically undertaken to (a) minimise eccentric forces (and therefore moments) of the whole system by maximising cutterhead symmetry, (b) maximise excavation efficiency by ensuring that adjacent cutters score the tunnel face successively and (c) minimise excavation-induced stress on the cutterhead (e.g. Ji et al., 2016). Other common constraints include the following: (a) cutter discs must remain contained within the cutterhead, (b) cutter discs must not overlap, (c) cutter discs must not interfere with manholes, ‘buckets’ or joints in the cutterhead and (d) cutter disc positions should be easily accessible for maintenance (Rostami and Chang, 2017).

Figure 10 Overview of the cutterhead design

An example optimisation documented by Huo et al. (2010, 2011) using a multi-objective GA and co-evolutionary GA is presented in Figure 11. Those authors used three ‘base’ designs as the starting point for the optimisation to reflect current designs used in practice: a multispiral (Figure 11(a)), a ‘dynamic star’ (Figure 11(b)) and a stochastic pattern (Figure 11(c)). Another possible reason for the use of these base designs is that the results of the optimisation process were reported to be highly dependent on the initial cutter pattern. This was also discovered by Qi et al. (2013) using grey relational analysis (GRA). GRA is a form of grey system theory (proposed by Deng (1982)) and solves multiple attribute decision-making problems by combining the entire range of attribute values being considered for each alternative decision into a single value (Kuo et al., 2008). Those authors also found that the polar angle played a more important role in the cutter layout than the radial distance from the centre point of the cutterhead. Although not discussed in those studies, these findings suggest the occurrence of local optima in these optimisation problems. While multiple alternative optimisation algorithms exist (e.g. grid search, random search), Bayesian optimisation (Brochu et al., 2010) seems suitable for this problem given its robustness to local optima. This is due to the way it balances exploration against exploitation: exploitation initially steers the search process in the direction of a local optimum, but exploration allows the algorithm to ‘escape’ from that local optimum towards an improved global optimum.
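To illustrate how GRA collapses several attributes into a single value per alternative, the sketch below follows the usual GRA steps: normalise each attribute, measure the deviation from an ideal reference sequence, convert deviations to grey relational coefficients and average them into a grade. It is a generic illustration, not the procedure of Qi et al. (2013). The three layouts, their attribute values, the equal attribute weights and the distinguishing coefficient ζ = 0.5 are all assumptions made for the example.

```python
def grey_relational_grades(alternatives, zeta=0.5):
    """Grey relational grade per alternative; all attributes are smaller-is-better.

    alternatives maps a name to a list of attribute values (same order for all).
    zeta is the distinguishing coefficient (0.5 is a common default, assumed here).
    """
    names = list(alternatives)
    n_attr = len(alternatives[names[0]])

    # Step 1: normalise each attribute to [0, 1] so that the best value maps to 1.
    normalised = {name: [] for name in names}
    for k in range(n_attr):
        column = [alternatives[name][k] for name in names]
        lo, hi = min(column), max(column)
        if hi == lo:
            raise ValueError("each attribute must vary across alternatives")
        for name in names:
            normalised[name].append((hi - alternatives[name][k]) / (hi - lo))

    # Step 2: deviation from the ideal reference sequence (all ones).
    deltas = {name: [abs(1.0 - v) for v in normalised[name]] for name in names}
    all_deltas = [d for values in deltas.values() for d in values]
    d_min, d_max = min(all_deltas), max(all_deltas)

    # Steps 3 and 4: grey relational coefficients, then equally weighted grade.
    grades = {}
    for name in names:
        coeffs = [(d_min + zeta * d_max) / (d + zeta * d_max) for d in deltas[name]]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Hypothetical cutter layouts: [eccentric force (kN), overturning moment (kN m)].
layouts = {"A": [10.0, 5.0], "B": [8.0, 4.0], "C": [12.0, 7.0]}
grades = grey_relational_grades(layouts)
best = max(grades, key=grades.get)
print(grades, "best:", best)
```

The layout with the highest grade is the one closest to the ideal on all attributes taken together, which is how GRA turns a multi-attribute comparison into a single ranking.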

Figure 11 Optimal cutterhead layout obtained by Huo et al. (2011): (a) spiral; (b) dynamic star; (c) stochastic

On the geometric design of individual cutters, Xia et al. (2012, 2015) used GA and multi-objective and multi-geologic condition optimisation to optimise the (a) cutter cutting edge angle, (b) cutting edge width, (c) transition arc radius and (d) caulking ring width between bearings. The optimisation process sought to minimise the cutter bearing load.

This review has identified an increasing trend in the use of ML in the tunnelling space with a significant increase in 2019. It is likely that this trend will persist as advancements in ML continue to be translated into practical domains for routine use and more tunnelling data are shared with the academic community. ANNs have experienced sustained popularity in this area. This is not surprising as ANNs are one of the oldest ML paradigms and are able to capture complex non-linear relationships and generalise within the trained parameter space. The second most popular technique is SVR/SVM. The non-parametric nature of these models means that model complexity remains relatively unaffected by an increase in the number of features, and these models are therefore particularly suited to high-dimensional data sets. This may go some way to explaining their popularity, particularly for settlement predictions due to the larger number of influencing factors. These techniques have been typically coupled with optimisation algorithms to overcome the slow tuning process of the kernel hyperparameters.

The use of fuzzy-based methods such as ANFIS and FL in this area stems from their ability to incorporate human experience and their ability to deal with imprecise and noisy data typical of construction monitoring projects. These methods have not experienced the same growth, which is probably due to the increase in ‘big data’ in tunnelling that lends itself to training more robust algorithms. It is also apparent that there has been a significant and recent increase in the use of alternative ML algorithms such as GEP and RF. These models provide a higher level of performance at the expense of model interpretability and can therefore capture highly non-linear trends. The use of probabilistic ML techniques, such as Bayesian networks and GPR, for underground construction applications has become more popular in recent years – for example, the studies by Zhang et al. (2016), Wang et al. (2017), Chen et al. (2019d) and Zhu et al. (2019). These methods are well suited to dealing with noisy and incomplete data typical of a construction site and perform predictions within a principled framework. In light of this, they represent the most promising techniques for future applications of ML to inform tunnelling operations.
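To make the appeal of probabilistic methods concrete, the sketch below implements a bare-bones Gaussian process regression for a single scalar input. It returns a predictive variance alongside each prediction, so the model signals when it is being asked to extrapolate. The squared-exponential kernel, its fixed hyperparameters, the noise variance and the data are all assumptions for illustration; a practical implementation would learn the hyperparameters from data and use a Cholesky factorisation rather than the simple solver shown here.

```python
import math


def sq_exp_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance k(x, x') for scalar inputs."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise_var=0.01):
    """Posterior mean and variance of the latent function f at x_new."""
    n = len(x_train)
    # Covariance of the noisy training observations: K + noise_var * I.
    k_mat = [[sq_exp_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    k_star = [sq_exp_kernel(x_new, xi) for xi in x_train]
    alpha = solve(k_mat, y_train)
    beta = solve(k_mat, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = sq_exp_kernel(x_new, x_new) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var


# Hypothetical normalised data: input (e.g. chainage) against a noisy response.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [0.0, 0.8, 0.9, 0.1]
for x_new in (1.0, 10.0):
    mean, var = gp_predict(x_train, y_train, x_new)
    print(f"x = {x_new:4.1f}: mean = {mean:.3f}, variance = {var:.3f}")
```

Close to the training data the predictive variance is small, whereas far from it the variance returns to the prior level, flagging that the prediction there is essentially uninformed.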

This paper has presented a comprehensive review of the literature exploring the use of ML to inform tunnelling operations. While ML has been used to inform a wide range of tunnelling applications, this review has identified four main areas of research – namely, TBM performance prediction, tunnelling-induced settlement prediction, geological forecasting and cutterhead design optimisation. Many studies have reported the successful application of ML techniques in tunnelling activities with high levels of accuracy. The most popular methods adopted in the literature include ANNs, SVM/SVR and fuzzy-based methods. A clear trend is evident in the use of ML in tunnelling, and this trend is likely to persist as the volume of data produced by modern TBMs continues to grow and the use of ML becomes more commonplace. In most instances, investigators have used empiricism (from previous literature) as the basis for the selection of model inputs where the number of features varies considerably across the literature. As the number of parameters captured by modern TBMs grows, identification of the most appropriate features for training ML models using robust techniques should be central to future research.

Despite its recent advances, ML in tunnelling remains a young field with many underexplored research opportunities. Some of these opportunities can be observed by contrasting the methods reviewed in this study with those adopted in other disciplines such as aerospace, healthcare, robotics and automated vehicles (Mooney et al., 2020). In particular, there is a real need for continued application of ML methods employing more principled, probabilistic frameworks such as Bayesian networks and GPR. The problems covered by this review appear well suited to probabilistic frameworks given the uncertain nature of tunnelling operations and the prevalence of noisy data. Such frameworks relieve engineers of onerous data preprocessing to denoise large training data sets. Furthermore, probabilistic frameworks provide a robust treatment of overfitting, meaning large data sets are not necessarily a prerequisite and deployment of these techniques on a site-specific basis is feasible.

Another important finding of this review is that most of the studies reviewed here have been developed and validated against a single case history. Validation of these algorithms across a broader parameter space is warranted for the industry to gain confidence in these approaches. As tunnelling data become more accessible, it may also become feasible to interrogate large data sets intelligently for the most appropriate training data for a given project. This would allow the relative performance of ML techniques on future projects to feed back into the improvement of the training data sets. In addition, because mistakes in the tunnelling industry carry high risk, model interpretability, which gives insight into the features driving predictions, is essential for take-up in practice.

Graphical causal inference represents an exciting area for future research. Several authors have argued that some of the most challenging open problems of ML and AI are intrinsically related to causality – for example, Pearl (2000, 2014) and Schölkopf (2019). In particular, the ML models reviewed in this paper suffer from a lack of generalisation (e.g. transfer to new problems). This is because these models are trained only on the most relevant information to limit the associated computational cost. However, information essential for generalisation, such as interventions, domain shifts and temporal structure, is typically neglected. Schölkopf (2019: p. 1) argues that ‘causality, with its focus on modelling and reasoning about interventions, can make a substantial contribution towards understanding and resolving these issues and thus take the field to the next level’. The integration of causal modelling in ML thus represents a promising avenue for more robust treatment of uncertainty in practical domains. It appears essential for the tunnelling industry to begin to consider how best to leverage these recent advances in ML to inform tunnelling operations.

## Acknowledgements

This project was supported by the Royal Academy of Engineering under the Research Fellowship scheme and by the Engineering and Physical Sciences Research Council (grant number EP/T006900/1).

### Recently Viewed

• Brian B Sheil, Stephen K Suryasentana, Michael A Mooney and Hehua Zhu