The proliferation of data collected by modern tunnel-boring machines (TBMs) presents a substantial opportunity for the application of machine learning (ML) to support the decision-making process on site with timely and meaningful information. The observational method is now well established in geotechnical engineering and has a proven potential to save time and money relative to conventional design. ML advances the traditional observational method by employing data analysis and pattern recognition techniques, predicated on the assumption that enough data are available to describe the physics of the modelled system. This paper presents a comprehensive review of recent advances and applications of ML to inform tunnelling construction operations with a view to increasing their potential for uptake by industry practitioners. This review has identified four main applications of ML to inform tunnelling – namely, TBM performance prediction, tunnelling-induced settlement prediction, geological forecasting and cutterhead design optimisation. The paper concludes by summarising research trends and suggesting directions for future research for ML in the tunnelling space.
A  face area of the tunnel-boring machine
a  parameter controlling the fuzziness of the system using the fuzzy c-means clustering algorithm
b  adjustable bias vector 
C  pipe string convergence 
c  mean of the Gaussian function 
c′  soil cohesion 
D  diameter of the tunnel-boring machine
D _{c}  diameter of the tunnel-boring machine cutting disc
d _{ j }  centre of data cluster j 
E _{l}  Young’s modulus of the tunnel lining 
E _{s}  Young’s modulus of the soil 
EI  flexural rigidity 
f( x )  latent function representing the underlying structure of the data 
g*  global best historical location for the particle swarm optimisation algorithm 
H  tunnel cover depth 
H _{w}  height of the groundwater table above the tunnel-boring machine
i _{t}, i _{l}  transverse and longitudinal inflection points of the soil surface settlement curve, respectively 
J _{FCM}  fuzzy c-means clustering objective function
K  coefficient of lateral earth pressure 
k _{s}  soil permeability 
k _{sub}  modulus of subgrade reaction 
k(x,x′)  covariance function of input pairs x and x′ 
L  tunnel length 
L _{ε}(y)  ε-insensitive cost function used in the support vector machine algorithm
M _{ij}  fuzzy cmeans clustering membership matrix 
n _{f}, n _{hi}, n _{o}  number of features, neurons in hidden layer i and outputs (respectively) in an artificial neural network 
n _{p}, n _{c}  number of data points and clusters, respectively, using fuzzy c-means clustering
P _{c}  chamber pressure 
P _{f}  tunnel-boring machine face pressure
P _{g}  grout pressure 
Q _{f}  conditioning foam flow rate 
q  rock quartz content 
R  tunnel radius 
R _{d}  soil relative density 
S _{max}  maximum soil surface settlement 
S(X,Y)  soil surface settlements at the settlement monitoring point position at X,Y 
s _{c}  spacing of tunnel-boring machine cutting discs
s _{u}  undrained shear strength of the soil 
T  cutterhead torque 
t  time 
t _{c}  thickness of the tunnel-boring machine cutting disc
t _{l}  tunnel lining thickness 
U _{max}  maximum horizontal soil displacement 
V _{g}  volume of grout 
V _{L}  volume loss 
v _{i}  velocity of particle i used in the particle swarm optimisation algorithm 
w  adjustable weight vector 
X  horizontal (transverse) distance to the settlement monitoring point 
 local (for particle i) best historical location for the particle swarm optimisation algorithm 
Y  horizontal (longitudinal) distance to the settlement monitoring point ahead of the face of the tunnel-boring machine
α  orientation of the planes of weakness in the rock mass 
β _{g}, β _{l}  global and local learning parameters, respectively, for the particle swarm optimisation algorithm 
γ  soil unit weight 
γ _{SVM}  support vector machine kernel coefficient 
ϵ  Gaussian noise 
ζ(x)  Gaussian membership function of an input value x 
θ  tunnel-boring machine pitching angle
κ  slope of the soil unload–reload curve 
μ (x)  mean vector of a Gaussian process 
ν _{l}  Poisson’s ratio of the tunnel lining 
ν _{s}  Poisson’s ratio of the soil 
ρ _{1}, ρ _{2}  two randomly initiated vectors with entries ranging between 0 and 1 
σ  standard deviation of the Gaussian function 
ϕ′  soil friction angle 
ψ′  soil dilation angle 
Rapid urbanisation points to the use of underground space as one of the most viable, sustainable and efficient means of delivering new services and transport in congested urban areas. The use of trenchless technology in infrastructure construction is growing in popularity for its cost and environmental savings compared with conventional open-excavation techniques (Royston et al., 2020b). In these obstructed underground spaces, optimising the performance of tunnelling operations is critical to ensure safe and economical construction while also preventing damage to existing infrastructure both above and below ground (Chieh et al., 2020).
Traditionally, tunnelling contractors have relied on empiricism, in addition to more formal design calculations. While simplified design calculations play an important role in tunnel design and construction, optimising tunnelling operations remains technically challenging due to their dependence on several complex factors, such as site geology, tunnel-boring machine (TBM) operational parameters and tunnel geometry (O’Dwyer et al., 2018, 2020; Phillips et al., 2019). Although a significant body of research conducted over the past 30 years has greatly enhanced understanding of these effects and their influence on tunnelling operations, the literature contains many examples where static ‘rule-based’ design methods fail to provide satisfactory prediction of field behaviour – for example, the papers by Barla et al. (2006), Choo and Ong (2015) and Sheil et al. (2016).
The proliferation of data collected by modern TBMs presents a substantial opportunity for the application of machine learning (ML) to support the decision-making process on site with timely and meaningful information (Sheil et al., 2020). While Shreyas and Dey (2019) present a high-level overview of machine learning techniques for tunnelling settlement and performance prediction, a more comprehensive review of recent advances and applications of ML to inform tunnelling construction operations is warranted to increase their potential for uptake by industry practitioners. To this end, this review has identified four main applications of ML to inform tunnelling – namely, TBM performance prediction, tunnelling-induced settlement prediction, geological forecasting and cutterhead design optimisation. The paper concludes by summarising research trends and suggesting directions for future research for ML in the tunnelling space.
The practice of ML has experienced immense recent growth, driven by advances in computational performance, sensing technology and data storage. In geotechnical engineering, ML advances the traditional observational method by employing data analysis and pattern recognition techniques, predicated on the assumption of the presence of enough data to describe the physics of the modelled system. These observational techniques have a proven potential to save time and money relative to conventional design (e.g. Royston et al., 2020a; Sheil et al., 2018).
‘Artificial intelligence’ (AI), ‘ML’ and ‘deep learning’ are three terms often used interchangeably to describe software that behaves in an intelligent manner. ML is a subset of AI that provides systems with the ability to learn and perform certain tasks automatically without being explicitly programmed. The most common implementation of ML involves the development of relationships between inputs and outputs. Where the outputs are provided with known labels (i.e. the correct outputs are known), the learning process is referred to as ‘supervised’. This contrasts with ‘unsupervised’ learning, where instances do not have corresponding labels. Deep learning is a further subset of ML that uses a specific ML algorithm called ‘deep’ artificial neural networks (ANNs), with many hidden layers, to learn from large amounts of data.
A drawback of many supervised learning techniques is the requirement for a large database of high-quality information to accurately capture the physics of the modelled system. The size of the data set required for the training process is highly dependent on the type of ML technique adopted, its intended role (e.g. interpolation, optimisation, forecasting) and the complexity of the input–output relationship being modelled. This section provides a brief overview of ML techniques commonly applied to tunnelling operations.
An ANN is an information-processing paradigm that draws inspiration from the operation of the human brain. A network consists of multiple interconnected layers of neurons, comprising a layer of input neurons, one or more layers of ‘hidden’ neurons that perform operations on the data and a layer of output neurons. Transformation of the input data is performed by the artificial neurons through the application of a nonlinear function (known as the activation function) to the sum of weighted inputs (see Figure 1). In its simplest form (a feedforward neural network), data travel in one direction – from input to output. After each complete iteration, termed an ‘epoch’, the network output values are compared with the target values to produce an error measurement. Feedback of the error through the network, known as ‘backpropagation’, is a nonlinear optimisation process that adjusts the weight and bias of each connection towards reducing the value of the cost function. In this paper, the ‘architecture’ describes the network structure in the form n _{f}-n _{hi}-n _{o}, where n _{f}, n _{hi} and n _{o} are the number of features, neurons in hidden layer i and outputs, respectively.
An alternative network form is a recurrent neural network (RNN) wherein connections between units form a directed cycle. This allows the network to maintain information in ‘memory’ over time and therefore use historical calculations to determine outputs. Long short-term memory (LSTM) is a type of RNN that uses a ‘memory cell’ that can store information for long periods of time. A set of ‘gates’ is used to decide whether information is stored in the memory cell, when information from the memory cell is deployed in the network or when information is removed from the cell altogether (i.e. forgotten).
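The forward pass and backpropagation step described above can be sketched in a few lines. This is only an illustration and not a model from any of the reviewed studies: the 2-3-1 architecture, the sigmoid activation, the learning rate and the toy data set are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (assumed): 4 samples, n_f = 2 features, n_o = 1 output (logical OR).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [1.0]])

# 2-3-1 architecture: weights and biases of the hidden and output layers.
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

def forward(X):
    h = sigmoid(X @ W1 + b1)        # activation of the weighted sum of inputs
    return h, sigmoid(h @ W2 + b2)

def mse(X, y):
    return float(np.mean((forward(X)[1] - y) ** 2))

initial_error = mse(X, y)
learning_rate = 1.0
for epoch in range(2000):
    h, out = forward(X)
    # Backpropagation: error gradients (of half the squared error) per layer.
    d_out = (out - y) * out * (1.0 - out)
    d_hid = (d_out @ W2.T) * h * (1.0 - h)
    # Gradient-descent update of weights and biases.
    W2 -= learning_rate * h.T @ d_out / len(X)
    b2 -= learning_rate * d_out.mean(axis=0)
    W1 -= learning_rate * X.T @ d_hid / len(X)
    b1 -= learning_rate * d_hid.mean(axis=0)

final_error = mse(X, y)
```

Comparing `final_error` with `initial_error` shows how the cost function is driven down over successive epochs.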
Fuzzy logic (FL) involves the integration of expert knowledge and experience into a fuzzy inference system using fuzzy ‘If–Then’ rules to model the qualitative aspects of human knowledge. This allows an extension of binary, classic logic to qualitative, subjective and approximate situations. Takagi and Sugeno (1985) presented the first systematic investigation of fuzzy modelling. The purpose of a fuzzy inference system is to map inputs to outputs through the application of fuzzy reasoning. Fuzziness is first applied to the inputs to produce a fuzzy set using a ‘membership function’, ζ(x), such as the Gaussian membership function:
$$\zeta(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right) \tag{1}$$
where x is the input value and σ and c are the standard deviation and mean of the Gaussian function, respectively. The resulting fuzzy set is processed using a set of If–Then rules. The results are subsequently defuzzified to produce ‘crisp’ outputs.
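As a brief illustration of the fuzzification step, the sketch below evaluates a Gaussian membership function in its common form exp(−(x − c)²/(2σ²)), with σ and c as defined above. The three fuzzy sets and their parameters are hypothetical values chosen for illustration only; they are not taken from any study in this review.

```python
import math

def gaussian_membership(x, c, sigma):
    """Degree of membership of x in a fuzzy set with mean c and spread sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

# Hypothetical fuzzy sets for rock strength (UCS in MPa): name -> (c, sigma).
fuzzy_sets = {
    "weak": (25.0, 15.0),
    "medium": (75.0, 20.0),
    "strong": (150.0, 30.0),
}

ucs = 60.0  # crisp input value (assumed)
memberships = {
    name: gaussian_membership(ucs, c, sigma)
    for name, (c, sigma) in fuzzy_sets.items()
}
```

A single crisp input therefore belongs to every fuzzy set to some degree, and it is these degrees of membership that the If–Then rules operate on.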
Adaptive neuro-fuzzy inference systems (ANFISs) denote the fusion of neural networks with FL principles. The key difference from traditional neural networks is that some or all of the nodes in the network are modified to be ‘adaptive’. This means that the outputs of the network are now dependent on the nodal parameters and the learning rule updates the parameters to minimise a prescribed error measurement. Relationships between variables are defined using fuzzy If–Then rules. ANFIS networks are typically organised in five layers as follows: (a) layer 1 is the input layer comprising the adaptive nodes and node functions and activates the fuzziness of the inputs, (b) layer 2 determines the firing strength of each rule, (c) layer 3 normalises the firing strengths, (d) layer 4 defines the consequence parameters and (e) layer 5 computes the ANFIS outputs by summing the outputs of layer 4.
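Conventional clustering techniques assign each data point to a single cluster without considering the extent of its ‘belonging’ to that cluster. First introduced by Dunn (1973), fuzzy c-means clustering (FCM) is a clustering approach that allows a data point to belong to multiple clusters with varying degrees of membership. This method uses an iterative clustering technique to find the optimal cluster centres, d _{j}, through the minimisation of an objective function J _{FCM}:
$$J_{\mathrm{FCM}} = \sum_{i=1}^{n_p} \sum_{j=1}^{n_c} M_{ij}^{a} \lVert x_i - d_j \rVert^2 \tag{2}$$
where x _{i} is data point i, M _{ij} is the degree of membership of x _{i} in cluster j, a is the parameter controlling the fuzziness of the system, and n _{p} and n _{c} are the numbers of data points and clusters, respectively.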
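A minimal sketch of the iterative FCM procedure is given below, assuming the standard alternating updates of the cluster centres d _{j} and the membership matrix M _{ij}. The fuzziness parameter a = 2, the fixed iteration count and the toy data are assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_c, a=2.0, n_iter=100, seed=0):
    """Return memberships M (n_p x n_c), centres d (n_c x dim) and J_FCM."""
    rng = np.random.default_rng(seed)
    n_p = X.shape[0]
    M = rng.random((n_p, n_c))
    M /= M.sum(axis=1, keepdims=True)   # memberships of each point sum to 1
    for _ in range(n_iter):
        Ma = M ** a
        # Centre update: membership-weighted mean of the data points.
        d = (Ma.T @ X) / Ma.sum(axis=0)[:, None]
        # Distance of every point to every centre (small offset avoids 0/0).
        dist = np.linalg.norm(X[:, None, :] - d[None, :, :], axis=2) + 1e-12
        # Membership update: closer centres receive higher membership.
        inv = dist ** (-2.0 / (a - 1.0))
        M = inv / inv.sum(axis=1, keepdims=True)
    J = float(np.sum((M ** a) * dist ** 2))
    return M, d, J

# Toy data (assumed): two well-separated groups of three points each.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])
M, d, J = fuzzy_c_means(X, n_c=2)
labels = M.argmax(axis=1)   # hard labels, if required, from the largest membership
```

Each row of `M` retains the degree of membership in every cluster, which is the information that a conventional hard clustering would discard.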
Classification and regression trees (CARTs) are a nonparametric method that builds classification or regression models in the form of a tree structure. At each tree node, a specified number of features are randomly selected and tested to achieve an optimal split of the data. Although decision trees can be highly effective, they are prone to overfitting and are sensitive to the specific data set on which they are trained. A robust solution to overfitting is the concept of random forests (RFs), first proposed by Breiman (2001). An RF is an ensemble learning method that operates by building multiple decision trees and aggregating the results (see Figure 2). Multiple different training sets (termed ‘bootstrap samples’) are generated by sampling with replacement randomly from the original data. This method builds several instances of a decision tree that produces an output ŷ _{ i } corresponding to each tree. All individual outputs are then averaged to obtain the final prediction, ŷ.
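The bootstrap-and-average idea can be sketched as follows. For brevity, each ‘tree’ here is a depth-one regression tree (a stump) split on one randomly chosen feature, which is a deliberate simplification of a full RF; the data, the number of trees and all function names are hypothetical.

```python
import numpy as np

def fit_stump(X, y, feature):
    """Best single split on one feature, minimising the squared error."""
    best = (np.inf, None, y.mean(), y.mean())
    for t in np.unique(X[:, feature])[:-1]:
        left, right = y[X[:, feature] <= t], y[X[:, feature] > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return feature, best[1], best[2], best[3]

def predict_stump(stump, X):
    feature, t, left_val, right_val = stump
    if t is None:                       # no valid split: predict the mean
        return np.full(len(X), left_val)
    return np.where(X[:, feature] <= t, left_val, right_val)

def fit_forest(X, y, n_trees=50, seed=0):
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample (with replacement)
        feature = rng.integers(0, X.shape[1])        # randomly selected feature
        forest.append(fit_stump(X[idx], y[idx], feature))
    return forest

def predict_forest(forest, X):
    # Average the individual tree outputs to obtain the final prediction.
    return np.mean([predict_stump(s, X) for s in forest], axis=0)

# Toy data (assumed): the target steps up when feature 0 exceeds 0.5.
rng = np.random.default_rng(1)
X = rng.random((200, 2))
y = np.where(X[:, 0] > 0.5, 10.0, 0.0) + rng.normal(scale=0.5, size=200)

forest = fit_forest(X, y)
pred = predict_forest(forest, np.array([[0.9, 0.5], [0.1, 0.5]]))
```

Because every tree sees a different bootstrap sample, the averaged prediction is less sensitive to any single training set than an individual tree would be.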
A Gaussian process is a collection of random variables of which any finite number follows a joint Gaussian distribution (Williams and Rasmussen, 1996). Gaussian process regression (GPR) provides a method for performing Bayesian inference about functions in a nonparametric way. One of the key aspects of GPR is the use of covariance functions that encode prior assumptions about the functions that one wishes to learn (in this case the measured data). This avoids reliance on algebraic mapping between inputs and outputs. The overall aim of the process is to learn a regression model of the form y = f(x) + ϵ, where f(x) is a latent function representing the underlying structure of the data and ϵ ∼ N(0, σ ^{2}) is a Gaussian noise term where σ ^{2} is the variance of the noise (the symbol ‘∼’ means ‘distributed according to’). A Gaussian process can be completely described by a mean vector, μ(x), and covariance function k(x, x′) of input pairs x and x′ to describe an underlying real process f(x) as follows:
$$f(x) \sim \mathcal{GP}\left(\mu(x),\, k(x, x')\right) \tag{3}$$
where
$$\mu(x) = \mathbb{E}\left[f(x)\right] \tag{4}$$
$$k(x, x') = \mathbb{E}\left[\left(f(x) - \mu(x)\right)\left(f(x') - \mu(x')\right)\right] \tag{5}$$
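To illustrate how a covariance function turns measured data into predictions with uncertainty, the sketch below computes the standard GPR posterior mean and standard deviation. It assumes a zero prior mean, a squared-exponential covariance function with unit length scale and a fixed noise variance; none of these choices is taken from the reviewed studies.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential covariance k(x, x') for 1-D inputs (assumed form)."""
    sq = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * sq / length_scale ** 2)

def gpr_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and standard deviation of f at x_test (zero prior mean)."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Toy observations (assumed) of a smooth latent function.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.sin(x_train)

# One test point between observations and one far from all of them.
x_test = np.array([1.5, 10.0])
mean, std = gpr_posterior(x_train, y_train, x_test)
```

The predicted standard deviation is small between observations and reverts to the prior far from the data, which is what makes GPR attractive when the reliability of a forecast matters.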
The term ‘support vector regression’ (SVR) denotes the application of support vector machines (SVMs) to regression problems. The ε-insensitive approach first proposed by Vapnik (1995) is one of the most widely adopted SVM/SVR approaches in the literature. SVR uses either linear or nonlinear kernels to map the input space into a high-dimensional feature space. The most common kernel adopted for this purpose is the radial basis function (RBF):
$$k(x, x') = \exp\left(-\gamma_{\mathrm{SVM}} \lVert x - x' \rVert^2\right) \tag{6}$$
where γ _{SVM} is a kernel coefficient. A hyperplane is subsequently constructed in the feature space where the quality of fit to the data is computed using an ε-insensitive cost function (L _{ε}(y)) defined as follows (see Figure 3):
$$L_{\varepsilon}(y) = \begin{cases} 0 & \text{if } |f(x) - y| \le \varepsilon \\ |f(x) - y| - \varepsilon & \text{otherwise} \end{cases} \tag{7}$$
where x is the input data with target values y; f(x) is the regression function; and ε is a user-defined positive value representing the maximum distance between f(x) and y for which there is no loss in the cost function. According to Equation 7, only predictions that have residuals greater than ε are penalised, while predictions with smaller residuals have no effect on the regression equation. Considering a linear function as an example, f(x) can be defined as follows:
$$f(x) = w \cdot x + b \tag{8}$$
where w is an adjustable weight vector and b is the bias. The objective is to obtain a function that has the smallest ε deviation from the target values in the training data and is also as ‘flat’ as possible (by minimising the squared Euclidean norm ‖w‖ ^{2}).
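The behaviour of the ε-insensitive cost function can be illustrated with a short sketch, assuming the usual piecewise form in which residuals no larger than ε incur zero loss. The weight vector, bias, ε and sample values below are hypothetical.

```python
def linear_model(x, w, b):
    """Linear regression function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube; grows linearly with the residual outside it."""
    return max(0.0, abs(y_pred - y_true) - epsilon)

# Hypothetical parameters and a single training sample.
w, b, epsilon = [0.5, -0.25], 1.0, 0.1
x, y = [2.0, 4.0], 1.05

prediction = linear_model(x, w, b)                       # 0.5*2 - 0.25*4 + 1
loss_inside = epsilon_insensitive_loss(y, prediction, epsilon)
loss_outside = epsilon_insensitive_loss(y + 0.45, prediction, epsilon)
```

Here the first residual lies within the tube and contributes nothing to the cost, whereas the second is penalised only by the amount it exceeds ε.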
Extreme learning machine (ELM) is a three-layer neural network – that is, it comprises a single hidden layer (Huang et al., 2004). The novelty of ELM centres around its use of randomly generated hyperparameters for the hidden layer, which are not updated during training, unlike conventional neural networks (Huang et al., 2006). This significantly reduces the computational time associated with the learning process and increases the ability of the network to generalise within the trained parameter space. The ELM training process involves the generation and selection of random numbers for the weight and bias matrices for the hidden layer (Huang et al., 2011). Since the number of neurons in the hidden layer is typically much less than the number of training observations, the network is an overdetermined linear system. A consequence of this is that the output weight matrix is the only parameter that needs to be optimised during training, which can be undertaken using an ordinary least-squares approach.
First proposed by Holland (1992), genetic algorithms (GAs) are arguably the most popular variant of evolutionary algorithm. These methods are a computational model inspired by evolution and the mechanisms of natural selection and are typically deployed as search and optimisation algorithms. The parameters of the user-defined search space are first encoded in the form of chromosomes, which can in turn be grouped to form a population. The process begins by initiating a random population representing different nodes in the search space. The fitness (cost) function is then evaluated for each node to determine the fitness value. New search nodes are randomly generated by applying genetic operations on the nodes based on their fitness values. This process is repeated until an optimal solution is acquired. The purpose of the genetic operators is to combine the ‘good’ structures of each node to produce an improved search node. Common genetic operators are shown in Figure 4 and include (a) crossover (portions of chromosomes are swapped), (b) reproduction (chromosomes with good fitness values in an old population are preserved in the new population) and (c) mutation (occasional random alteration of a chromosome).
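The ELM training procedure reduces to a single least-squares solve, as the following sketch shows. The tanh activation, the 20 hidden neurons, the normally distributed random weights and the toy regression target are assumptions for illustration.

```python
import numpy as np

def train_elm(X, y, n_hidden, seed=0):
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn at random and never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                          # hidden-layer output matrix
    # Output weights: ordinary least-squares solution of H @ beta = y.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy data (assumed): 200 observations of a quadratic target.
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = X[:, 0] ** 2

model = train_elm(X, y, n_hidden=20)
rmse = float(np.sqrt(np.mean((predict_elm(model, X) - y) ** 2)))
```

With 200 observations and 20 hidden neurons, the system for `beta` is overdetermined, so no iterative backpropagation is required.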
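The three genetic operators can be sketched on a simple binary problem. The example below maximises the number of ones in a fixed-length binary chromosome; this fitness function, the population size, the mutation rate and the rule that only the fitter half of the population breeds are all assumptions made for illustration.

```python
import random

def one_max(chromosome):
    """Toy fitness function (assumed): the number of ones in the chromosome."""
    return sum(chromosome)

def genetic_algorithm(fitness, n_genes=20, pop_size=30, n_gen=40, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    # Random initial population of binary chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(n_gen):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        # Reproduction: the fittest chromosomes are preserved unchanged.
        children = [c[:] for c in pop[: pop_size // 5]]
        while len(children) < pop_size:
            p1, p2 = rng.sample(pop[: pop_size // 2], 2)
            # Crossover: swap chromosome portions at a random cut point.
            cut = rng.randint(1, n_genes - 1)
            child = p1[:cut] + p2[cut:]
            # Mutation: occasional random flip of a gene.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness), history

best, history = genetic_algorithm(one_max)
```

Because the fittest chromosomes are carried over unchanged, the best fitness recorded in `history` cannot decrease from one generation to the next.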
Three alternative evolutionary algorithms include (a) genetic programming (GP), (b) gene expression programming (GEP) and (c) differential evolution (DE). The fundamental difference between these approaches lies primarily in the composition of the individuals within the respective populations. In GAs, individuals are linear chromosome strings of fixed length; in GPs, they are nonlinear units with varying shapes and sizes; in GEPs, they are encoded linear strings of fixed length (similar to GA chromosomes), which are subsequently expressed as nonlinear units of varying shapes and sizes; and in DEs, they are real vectors rather than binary chromosome strings.
The imperialist competitive algorithm (ICA) is an alternative evolutionary search and optimisation algorithm proposed by Atashpaz-Gargari and Lucas (2007) and is derived from human beings’ sociopolitical evolution. In this case, the initial population is termed ‘countries’ and is broken into two categories: (a) colony and (b) imperialist state. A cost function is used to determine which countries of the initial population are the most ‘powerful’ and are therefore selected as imperialist states. The remaining countries are assigned as colonies of the imperialist states depending on the value of the cost function for each imperialist state. Each imperialist state and its colonies together form an empire. The ensuing optimisation process is described by Figure 5.
A wide range of ML techniques have been developed for tunnelling applications. Research areas have included TBM automation (Mokhtari and Mooney, 2019), tunnel condition assessment (Chen et al., 2019a; Li et al., 2017; Zhu et al., 2020), anomaly detection (e.g. Sheil et al., 2020; Yu et al., 2018), tunnel profile measurement (e.g. Xue and Zhang, 2019), resilience assessment (e.g. Khetwal et al., 2019), structural defect identification (e.g. Ding et al., 2019), tunnel face stability (e.g. Hayashi et al., 2019), rockburst prediction (e.g. Liu and Hou, 2019) and intelligent building information modelling (e.g. Zhao et al., 2019a). This review focuses on four tunnelling applications where the use of ML has been most prevalent: (a) TBM performance prediction, (b) tunnelling-induced settlement prediction, (c) geological forecasting and (d) cutterhead design optimisation.
A large body of research has focused on the development of improved TBM performance predictions by leveraging recent advances in ML. Table 1 presents an overview of these studies where the corresponding parameters and notation are defined in Figure 6 (a slurry pressure balance shield machine is shown for illustrative purposes) and Table 2. Research into TBM performance has been largely confined to open-mode TBM tunnelling in rock with only a handful of efforts with slurry or earth pressure balance (EPB) shield TBMs in softer soils (e.g. Mokhtari and Mooney, 2020; Mokhtari et al., 2020; Mooney et al., 2018). Mooney et al. (2018) note that maintenance of EPB, TBM guidance using thrust and articulation jacks, scraping and imbibing of the in situ ground and muck processing through a depressurising screw conveyor combine to make performance prediction of EPB TBMs particularly challenging.

Table 1. Overview of ML studies on TBM performance prediction
Reference  ML algorithm^{a}  Features^{b}  Predictand^{c}  TBM type^{d}  Data set and size

Grima et al. (2000)  ANFIS  Core fracture frequency, UCS, RPM, thrust/cutter, D _{c}  PR  —  640 – various tunnels worldwide 
Benardos and Kaliampakos (2004)  ANN (8941)  Rock mass fracture degree, RMW, SF, RMQ, UCS, H, WT, k  AR  O  11 – Athens metro tunnel (Greece) 
Simoes and Kim (2006)  FL (rule- and parametric-based)  D, RQD, RMR, water inflow rate  Utilisation  O  Milyang tunnel (South Korea); Queens water tunnel (USA); Manapouri tunnel (New Zealand)
Mohammadi et al. (2007)  RBF-ANN (8-?-1)  RQD, UCS, SF, WT, RMW, RMR, H, k  AR  O  11 – Athens metro tunnel (Greece)
Zhao et al. (2007)  Ensemble neural network  UCS, DPW, α, BI  SBI  EPB  47 – the Deep Tunnel Sewerage System (Singapore) 
Acaroglu et al. (2008)  FL  UCS, BTS, D _{c}, t _{c}, s _{c}, penetration  SER  —  Linear cutting tests 
Benardos (2008)  ANN (8951); ANN (8941)  RQD, RMW, SF, RMR, UCS, H, WT, k  PR  O  330 – Maen tunnel; 301 – Pieve tunnel (Italy); 11 – Athens metro tunnel (Greece)
Mikaeil et al. (2009)  Multifactorial fuzzy evaluation  UCS, BTS, PSI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Yagiz et al. (2009)  ANN (4-8-1); nonlinear multivariate regression  UCS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Gholamnejad and Tayarani (2010)  ANN (39731)  UCS, RQD, DPW  PR  O  185 – Queens water tunnel (USA), Karaj–Tehran water tunnel (Iran), Gilgel Gibe II tunnel (Ethiopia) 
Yagiz and Karahan (2011)  PSO  UCS, BI, DPW, α, TBM field data  PR  O  151 – Queens water tunnel (USA) 
Maher (2013)  Linear regression; polynomial regression (degree = 3); SVR (linear, polynomial kernels)  12 TBM parameters  PR  EPB  Seattle subway tunnel (USA) 
Oraee et al. (2012)  ANFIS  RQD, UCS, DPW  PR  O  177 – Queens water tunnel (USA), Gilgel Gibe II tunnel (Ethiopia) 
Ge et al. (2013)  Least-squares SVM  UCS, BTS, PSI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Ling et al. (2013)  Partial least-squares FNN  UCS, BTS, PSI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Martins and Miranda (2013)  ANN (4-2-1); SVR (RBF kernel)  UCS, PSI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Mobarra et al. (2013)  ANN (41341)  UCS, PLS, RPM, normal force designation  PR  O  289 – Golab water tunnel (Iran) 
Salimi and Esmaeili (2013)  Linear regression; nonlinear multiple regression; ANN (517101)  UCS, BTS, PSI, DPW, α  PR  O  46 – Karaj–Tehran water tunnel (Iran)
Shao et al. (2013)  Online prediction with ELM incremental learning  UCS, BTS, PSI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Špačková and Straub (2013)  Dynamic Bayesian networks  Ground zone, rock class, H, ground class, human factor, project geometry, CM, failure mode, number of failures  Time  —  Numerical modelling of Suncheon–Dolsan tunnel (South Korea) 
Ghasemi et al. (2014)  FL  UCS, BTS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Mahdevari et al. (2014)  SVR (RBF kernel)  JF, α, UCS, BTS, CP, DPW, SE, BI  PR  O  151 – Queens water tunnel (USA) 
Salimi et al. (2015)  ANN (2-4-1); ANFIS; SVR (RBF kernel)  UCS, DPW  PR  O  75 – Zagros water conveyance tunnel (Iran)
Tao et al. (2015)  RF  UCS, BTS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Yagiz and Karahan (2015)  DE; hybrid harmony search; grey wolf optimiser  UCS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Fattahi (2016)  FCM-ANFIS  UCS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Salimi et al. (2016)  ANFIS; SVR (RBF kernel)  UCS, DPW  PR  O  75 – Zagros water conveyance tunnel (Iran)
Adoko et al. (2017)  Bayesian inference  UCS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Armaghani et al. (2017)  ANN (7-11-1); PSO-ANN (7-11-1); ICA-ANN (7-11-1)  UCS, BI, RQD, RMR, RMW, JF, RPM  PR  O  1286 – PSRWT^{e} tunnel (Malaysia)
Fattahi and Babanouri (2017)  DE-SVM; artificial bee colony SVM; gravitational search SVM  UCS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Minh et al. (2017)  FL  UCS, BTS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Mooney et al. (2018)  SVR (RBF kernel; RReliefF feature selection)  JF, CP, Q _{F}, H, H _{w}  AR  EPB  Seattle University link tunnel (USA)
Armaghani et al. (2018)  GEP  UCS, BTS, RQD, RMR, RMW, JF, RPM  PR  O  1286 – PSRWT tunnel (Malaysia) 
Mikaeil et al. (2018)  Multifactorial fuzzy evaluation approach  UCS, PSI, DPW, α  PR  O  151 – Queens water tunnel (USA) 
Adoko and Yagiz (2019)  FCM clustering; subtractive clustering; ANFIS; knowledge-based fuzzy inference  Rock type, UCS, BI, α, DPW, JF  FPI  O  151 – Queens water tunnel (USA)
Armaghani et al. (2019)  PSO-ANN (8-12-1); ICA-ANN (8-12-1)  UCS, BTS, RMR, RQD, q, RMW, JF, RPM  AR  O  1286 – PSRWT tunnel (Malaysia)
Cachim and Bezuijen (2019)  Time series ANN  Foam injection ratio, lagged values of torque  T  EPB  Botlek rail tunnel (the Netherlands) 
Gao et al. (2019)  RNN; long short-term memory networks; gated recurrent networks  44 TBM parameters  T, velocity, JF, P _{c}  EPB  Shenzhen subway tunnel (China)
Koopialipoor et al. (2019a)  Group method of data handling  UCS, BTS, RQD, RMR, RMW, JF, RPM  PR  O  1286 – PSRWT tunnel (Malaysia) 
Koopialipoor et al. (2019b)  ANN (583281)  UCS, BTS, RQD, RMR, RMW  PR  O  1286 – PSRWT tunnel (Malaysia) 
Naghadehi et al. (2019)  ICA-GEP  UCS, BTS, BI, DPW, α  PR  O  151 – Queens water tunnel (USA)
Salimi et al. (2019)  CART; GP  UCS, RQD, DPW, joint condition  FPI  —  Various tunnels worldwide 
Shi et al. (2019a)  FCM clustering; attribute correlation guided FCM clustering  53 TBM parameters  PR  EPB  Tunnel in China 
Song et al. (2019)  Time series segmentation guided by FCM clustering  53 TBM parameters  PR  EPB  Tunnel in China 
Xu et al. (2019)  kNN; SVR (RBF kernels); ANN (6-?-1); CART; chi-squared automatic  UCS, BTS, RQD, RMW, JF, RPM  PR  O  1286 – PSRWT tunnel (Malaysia)
Zhou et al. (2020)  ANN (6-2-1); GP  UCS, RQD, RMR, BTS, JF, RPM  AR  O  1286 – PSRWT tunnel (Malaysia)
Koopialipoor et al. (2020)  Hybrid firefly-ANN (7-8-1)  UCS, BTS, RQD, RMR, RMW, JF, RPM  PR  O  1286 – PSRWT tunnel (Malaysia)
Mokhtari et al. (2020)  Elastic net regression  RPM, JF, Q _{F}, CP, SCP, H  AR  EPB  Seattle Northlink Extension tunnel (USA) 
Mokhtari and Mooney (2020)  SVR (RBF kernel; RReliefF feature selection)  RPM, JF, Q _{F}, CP, SCP  AR  EPB  Seattle Northlink Extension tunnel (USA)
^{a}FNN, fuzzy neural networks; kNN, k-nearest neighbours
^{b}CM, construction method; CP, cutterhead power; H, cover from surface to TBM; JF, cutterhead jacking force; Q _{F}, conditioning foam flow rate; RPM, cutterhead rotation speed; SCP, screw conveyor power; SE, specific energy; SF, stability factor; H _{w}, height of the groundwater table above TBM; WT, elevation of the groundwater table
^{c}AR, advance rate; FPI, field penetration index; PR, penetration rate; SBI, specific rock mass boreability index; SER, specific energy requirement
^{d}EPB, earth pressure balance; O, open-mode hard rock
^{e}PSRWT, Pahang–Selangor Raw Water Transfer

Table 2. Definitions of the ground parameters listed in Table 1
Parameter  Definition

UCS  Unconfined compressive strength 
DPW  Distance between planes of weakness 
α  Joint orientation 
BTS  Brazilian tensile strength 
BI  Brittleness index 
RQD  Rock quality designation 
RMW  Rock mass weathering 
RMR  Rock mass rating 
PSI  Point strength index 
RMQ  Rock mass quality 
q  Quartz content 
PLS  Point load strength 
From Table 1, it is notable that penetration rate (PR) is the most favoured measure of TBM performance, defined as the penetration along the axis of the tunnel per unit tunnelling time (i.e. downtimes/stoppages are not included in the calculation). It can be observed that the input parameters are dominated by ground (rock) properties, with unconfined compressive strength (UCS) being the most common. For example, the study by Benardos and Kaliampakos (2004) was among the earliest to use rock mass properties (e.g. UCS, rock mass rating, weathering) as inputs to an ANN for TBM performance prediction, where an error of 6–8% was obtained. For softer soils, Mooney et al. (2018) noted that TBM performance was most influenced by cutterhead torque, foam flow rate and screw conveyor rotation speed. It is noteworthy that the selection of input parameters has been predominantly guided by empiricism from previous literature and the application of more robust ‘feature engineering’ techniques in this area has been limited. Using principal component analysis, Salimi et al. (2015, 2016, 2019) confirmed the strong dependence of TBM performance on rock mass parameters (e.g. UCS, rock quality designation, joint spacing and condition) for hard-rock tunnelling.
Another interesting observation is the infrequent use of TBM operational parameters (e.g. jacking force (JF), cutterhead torque (T), cutterhead rotation speed (RPM), slurry parameters) and geometric parameters (e.g. tunnel diameter, distance from reception shaft, soil cover) as features. This is because many of the training data sets relate to a single construction project, and it is commonly assumed that TBM and geometric parameters remain constant during a given project and so need not be included in the ML model. While this provides good predictability on a case-by-case basis (where one might wish to forecast the performance of the TBM for the current project based on the data gathered thus far), it limits the applicability of the trained ML models to other projects. This limitation is particularly important because ML models typically demonstrate a poor ability to extrapolate beyond their calibration space (Ahmed et al., 2010). Recent studies incorporating the influence of TBM operational parameters for performance prediction have demonstrated an improved ability to generalise, for example, to alternative excavation techniques (Song et al., 2019).
The most common ML technique adopted for the prediction of TBM performance is a multilayer feedforward ANN with backpropagation. The main difference between the ANN models adopted in the literature is the ANN architecture that was ultimately selected. Even though similar input parameters and data sets have been employed across various studies, the range of architectures adopted is quite wide. For example, Armaghani et al. (2017) and Koopialipoor et al. (2019b) adopted 7-11-1 (n _{f}-n _{h1}-n _{o}) and 583281 architectures, respectively, for the prediction of the same data set (the Pahang–Selangor Raw Water Transfer (PSRWT) tunnel). It is noteworthy that the use of several hidden layers and neurons increases the likelihood of overfitting. Hecht-Nielsen (1987) proved that any continuous function can be represented by a neural network using a single layer with n _{h1} = 2n _{f} + 1 nodes, albeit using significantly more complex activation functions than the conventional sigmoidal functions commonly adopted in the literature. This corresponds to architectures of 7-15-1 and 5-11-1, respectively, for these studies.
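The single-hidden-layer sizing rule quoted above, n _{h1} = 2n _{f} + 1, and its effect on model size can be checked with a short calculation. The parameter count below assumes a fully connected feedforward network with one bias per neuron; the helper names are hypothetical.

```python
def hecht_nielsen_hidden(n_f):
    """Hidden-layer size n_h1 = 2 * n_f + 1 for n_f input features."""
    return 2 * n_f + 1

def count_parameters(architecture):
    """Weights plus biases of a fully connected feedforward network (assumed)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(architecture, architecture[1:]))

# Architectures implied by the rule for 7 and 5 input features, one output.
arch_7 = [7, hecht_nielsen_hidden(7), 1]   # 7 features
arch_5 = [5, hecht_nielsen_hidden(5), 1]   # 5 features

params_7 = count_parameters(arch_7)
params_5 = count_parameters(arch_5)
```

Comparing such parameter counts with the number of available training observations gives a quick, if rough, indication of how exposed an architecture is to overfitting.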
Other popular ML methods adopted in the literature include FL, due to its ability to incorporate empirical evidence/experience and, recently, more flexible and nonlinear ML algorithms such as CARTs (e.g. Xu et al., 2019) and RFs (e.g. Tao et al., 2015). To develop improved methods for the determination of the optimum architecture and the avoidance of local minima, hybrid methods have also been explored by fusing ML models with optimisation algorithms such as ICA (e.g. Naghadehi et al., 2019), PSO (e.g. Armaghani et al., 2018), DE (e.g. Fattahi and Babanouri, 2017) and FCM (e.g. Fattahi, 2016).
Table 3 presents an overview of ML models adopted for the prediction of tunnelling-induced soil settlements, s, as well as tunnel convergence, C. Given the complex nature of tunnelling-induced settlements, the number of features used in these models is notably greater. Furthermore, these features comprise a mix of soil, tunnel geometry and TBM operational parameters. For the studies considered in this review, ANNs appear to have been the ML model of choice pre-2012, although they continue to appear in more recent literature. It is again apparent that a wide range of architectures have been explored, from the 47-47-47-47-2 architecture adopted by Kim et al. (2001) to the more compact 3-4-1 architecture proposed by Hasanipanah et al. (2016) and Moghaddasi and Noorian-Bidgoli (2018).

Reference  ML algorithm ^{a}  Salient features ^{b}  Predictand ^{c}  Data set and size ^{d} 

Shi et al. (1998)  ANN (8241)  L, H, A, delay in closing inverted arch, WT, AR, CM, SPT values  S _{max}  356 – Brasilia tunnel (Brazil) 
Kim et al. (2001)  ANN (474747472)  47 tunnel, TBM and soil parameters  S _{max}, i  113 – Seoul subway tunnel (South Korea) 
Neaupane and Adhikari (2006)  ANN (6391)  H, D, s _{u}, V _{L}, CM, WT  S _{max}  26 – various projects worldwide 
Neaupane and Adhikari (2006)  ANN (6351)  H, D, s _{u}, V _{L}, CM, WT, S _{max}  U _{max}  26 – various projects worldwide 
Suwansawat and Einstein (2006)  ANN (10201)  H, L, geology at crown and invert, WT, P _{f}, PR, θ, P _{g}, V _{g}  S _{max}  49 – Bangkok MRT tunnel (Thailand) 
Yoo and Kim (2007)  ANN (844)  H, WT, support pattern, geologies, soil layer thicknesses  S _{max}, C, lining stresses  95 – highspeed railway tunnel (South Korea) 
Santos and Celestino (2008)  ANN (141261)  14 parameters: tunnel geometry and ground conditions  S _{max}  81 – São Paulo subway tunnel (Brazil) 
Boubou et al. (2010)  ANN (11771)  AR, T, P _{f}, P _{g}, V _{g}, JF, time, steering deviations, total work, X/H  S(X)  432 – Toulouse subway tunnel (France) 
Franza et al. (2018)  ANN (542)  H, R _{d}, V _{L}, X, H  S(X,Z), U(X,Z)  Centrifuge test data 
Goh and Hefney (2010)  ANN (851)  H, AR, P _{f}, SPT at crown and springline, wc, E _{s}, P _{g}  S _{max}  148 – MRT tunnel (Singapore) 
Kongsomboon et al. (2010)  ANN (1415151)  H, D, L, Y, geology, WT, P _{f}, PR, θ, P _{g}, V _{g}  U _{max}  38 – Chaloem Ratchamongkol MRT and Bangkok water conveyance tunnels (Thailand) 
Qiao et al. (2010)  ANN (13201)  H, L, geologies, WT, P _{f}, PT, θ, P _{g}, V _{g}  S _{max}  49 – Bangkok MRT tunnel (Thailand) 
Tsekouras et al. (2010)  ANN (5113)  Stability, lining placement, t _{l}, E _{l}, Y  S _{max}, U _{max}, V _{max}  7650 – finitedifference analyses 
Ninic et al. (2011)  ANN (6141)  P _{f}, P _{g}, κ, H, X, Y  S(X,Y)  2160 – finiteelement analyses 
Darabi et al. (2012)  ANN (5171)  c′, ϕ′, E _{s}, H, D  S _{max}  50 – various projects in Iran and Turkey 
Li et al. (2012)  PSOSVM with chaotic mapping  —  C _{h}  39 – Xiakeng tunnel (China) 
Mahdevari and Torabi (2012)  ANN (935281); RBFANN (935281)  H, GSI, RQD, compressive and tensile strength, c′, ϕ′, E _{s}, UCS  C _{ave}  60 – Ghomroud water tunnel (Iran) 
Mahdevari et al. (2012)  SVM (RBF kernels); ANN (935281)  H, GSI, RQD, compressive and tensile strength, c′, ϕ′, E _{s}, UCS  C _{ave}  60 – Ghomroud water tunnel (Iran) 
Marto et al. (2012)  ANN (9241)  H, SPT, wc, c′, ϕ′, E _{s}, γ, ν _{s}, Y  S(Y)  160 – Karaj urban railway tunnel (Iran) 
Pourtaghi and LotfollahiYaghin (2012)  WaveletANN (13101)  H, L, geologies, WT, P _{f}, PR, P _{g}, θ, V _{g}  S _{max}  49 – Bangkok MRT tunnel (Thailand) 
Rafiai and Moosavi (2012)  ANN (11742)  R, rock stresses, c′, ϕ′, E _{s}, ν _{s}, ψ′, t _{l}, E _{l}, ν _{l}  C _{h}, C _{v}  2500 – finitedifference analyses 
Adoko et al. (2013)  MARS; ANN (820261)  Rock class rating index, c′, ϕ′, E _{s}, γ, H, Y, time  C _{ave}  390 – CKTJ9 highspeed railway tunnel (China) 
Khatami et al. (2013)  ANN (6151)  Building EI, width, weight. Distance between tunnels, H, X  Building settlement  160 – finiteelement analyses 
Mahdevari et al. (2013)  SVM (RBF kernels)  wc, γ, c′, ϕ′, E _{s}, k _{sub}  C _{ave}  75 – Amirkabir tunnel (Iran) 
Ninić et al. (2013)  PSOANN (6201)  E _{s}, K, P _{f}, Y, X, P _{g}  S(X,Y)  625 – finiteelement analyses 
Ocak and Seker (2013)  ANN (1891); SVM (RBF kernels); GPR (SQ kernels)  18 tunnel, TBM and soil parameters  S  230 – Istanbul metro tunnel (Turkey) 
Bouayad and Emeriault (2014)  ANFIS combined with PCA  6 TBM and soil parameters  S  432 – Toulouse subway tunnel (France) 
Guo et al. (2014)  Elmantype PSORNN (4201)  H, JF, P _{f}, V _{g}  S  Jiangji subway tunnel (China) 
Ahangari et al. (2015)  ANFIS; GEP  H, D, c′, ϕ′, E _{s}  S _{max}  53 – finitedifference analyses 
Behnia and Shahriar (2015)  GEP  H, D, c′, ϕ′, E _{s}  S _{max}  50 – finitedifference analyses 
Bouayad et al. (2015)  Partial leastsquares regression combined with agglomerative hierarchical clustering  11 TBM and tunnel geometry parameters  S  432 – Toulouse subway tunnel (France) 
Khamesi et al. (2015)  Fuzzy systems coupled with (a) PSO, (b) ICA and (c) nearestneighbourhood clustering  K, E _{s}, s _{u}, soil mass number  S  240 – Karaj subway tunnel (Iran) 
Koukoutas and Sofianos (2015)  ANN (14151)  H, WT, geologies, JF, P _{f}, PR, T, P _{g}, V _{g}, excavated material  S _{max}  584 – Athens extension (317), Thessaloniki metro tunnels (267; Greece) 
Mohammadi et al. (2015)  ANN (6141)  H, soil type, γ, c′, ϕ′, E _{s}  S _{max}  17 – Niayesh highway tunnel (Iran) 
Dindarloo and SiamiIrdemoosa (2015)  CART  H, D, V _{L}, normalised V _{L}, s _{u}, WT, CM  S _{max}  34 – tunnels from the UK, the USA, Canada, Thailand, Brazil and Germany 
Cao et al. (2016)  RNN combined with Gappy POD  E _{s}, P _{g}  S  60 – finiteelement analyses 
Hasanipanah et al. (2016)  PSOANN (341)  K, s _{u}, E _{s}  S _{max}  143 – Karaj subway line 2 tunnel (Iran) 
Lai et al. (2016)  ANN (9?2)  H, D, c′, ϕ′, E _{s}, P _{ g }, V _{ g }, JF, PR  S _{max}, trough width coefficient  6 – three tunnel projects in China 
Wang et al. (2016)  Relevance vector machine (RBF kernels)  H, Y, P _{f}, P _{g}, AR, geologies, lagged settlement measurements  S  182 – Wuhan metro line 2 tunnel (China) 
Zhou et al. (2016)  RF (500 trees)  H, D, c′, ϕ′, E _{s}, P _{ g }, V _{ g }, JF, PR  S _{max}  26 – Shanghai, Guangzhou and Nanjing tunnels (China) 
Bouayad and Emeriault (2017)  ANFIS coupled with PCA and agglomerative hierarchical clustering  6 TBM and soil parameters  S  432 – Toulouse subway tunnel (France) 
Kohestani et al. (2017)  RF (270 trees)  H, L, geologies, P _{f}, PR, θ, P _{g}, V _{g}  S _{max}  49 – Bangkok MRT tunnel (Thailand) 
Naeini and Khalili (2017)  ANFIS  H, D, c′, ϕ′, E _{s}  S _{max}  46 – subway tunnels in Iran and Turkey 
Zhang et al. (2017)  Wavelet leastsquares GASVM (RBF kernels)  Measured settlement time histories  S  60 – Wuhan metro line 3 tunnel (China) 
Zhou et al. (2017)  RF (500 trees)  Set A: H, D, V _{L}, normalised V _{L}, s _{u}, WT, CM Set B: H, D, c′, ϕ′, E _{s}, P _{g}, V _{g}, JF, AR  S _{max}, trough width coefficient  66 – various tunnels worldwide 
Fattahi and Babanouri (2018)  Rock engineering systems  H, L, WT, P _{f}, PR, θ, P _{g}, V _{g}, geologies  S _{max}  49 – Bangkok MRT tunnel (Thailand) 
Goh et al. (2018)  MARS  H, AR, P _{f}, SPTs, wc, E _{s}, P _{g}  S _{max}  148 – three MRT tunnels in Singapore 
Mehrnahad and Zekrabad (2018)  ANN (7241)  L, H, γ, E _{s}, c′, ϕ′, P _{f}  S _{max}  181 – Mashhad metro line 2 tunnel (Iran) 
Moeinossadat et al. (2018a)  ANFIS  H, D, H/D, c′, ϕ′, E _{s}, P _{g}, V _{g}, JF, PR  S _{max}  41 – Shanghai subway line 2 tunnel (China) 
Moeinossadat et al. (2018b)  ANFIS; GEP; neurogenetic systems  H, D, c′, ϕ′, E _{s}, P _{g}, V _{g}, JF, AR  S _{max}  41 – Shanghai subway line 2 tunnel (China) 
Moghaddasi and NoorianBidgoli (2018)  ICAANN (341)  K, s _{u}, E _{s}  S _{max}  143 – Karaj subway line 2 tunnel (Iran) 
Sun et al. (2018a)  Multiclass SVM (RBF kernels)  D, H, support stiffness, rock tunnelling quality index  C _{ave}  117 – various tunnels worldwide 
Chen et al. (2019a)  ANN; RBFANN; general regression network  JF, T, P _{f}, PR, V _{g}, H, WT, modified SPT, modified DPT, modified UCS  S _{max}  200 – Changsha metro line 4 tunnel (China) 
Chen et al. (2019b)  ANN; waveletANN; general regression ANN; ELM; SVM; RF  JF, T, P _{f}, PR, V _{g}, H, WT, modified SPT, modified DPT, modified UCS  S _{max}  200 – Changsha metro line 4 tunnel (China) 
Fattahi and Bayatzadehfard (2019)  ANFIS with subtractive clustering; FCMANFIS; ANFIS with biogeographybased optimisation  H, L, WT, P _{f}, PR, θ, P _{g}, V _{g}  S _{max}  49 – Bangkok MRT tunnel (Thailand) 
Hajihassani et al. (2019)  GEP  H, c′, ϕ′, γ, ν _{s}, E _{s}  C _{ave}  118 – Karaj urban railway line 2 tunnel (Iran) 
Hu et al. (2019)  PSOANN; PSOSVR; PSOELM  Measured settlement time histories  S  70 – Zhuhai tunnel (China) 
Liu and Liu (2019)  GAGPR (SE + RQ kernels); GASVM (RBF kernels)  c′, ϕ′, ν _{s}, E _{s}, K, Y  C _{h}, C _{v}  Finitedifference analyses 
Moeinossadat and Ahangari (2019)  GEP  c′, ϕ′, γ, ν _{s}, E _{s}, K, H, P _{f}, surface surcharge  S _{max}  100 – finitedifference analyses 
Ramezanshirazi et al. (2019)  ANN (15301)  15 geometric, TBM and soil parameters  S _{max}  Milan M5 metro tunnel 
Saadallah et al. (2019)  Vector autoregressive with exogenous variables  —  S  160 – finiteelement analyses 
Shi et al. (2019b)  SVM with information granulation using twolayer perceptron kernel  —  C _{ave}  Panlongshan tunnel (China) 
Zhang et al. (2019a)  RF (91 trees)  JF, T, P _{f}, PR, V _{g}, H, WT, modified SPT, modified DPT, modified UCS, ground condition, stoppages  S _{max}  294 – Changsha metro line 4 tunnel set A (China) 
Zhang et al. (2019a)  RF (38–153 trees)  JF, T, P _{f}, PR, V _{g}, H, WT, modified SPT, modified DPT, modified UCS, ground condition  S _{max}  265 – Changsha metro line 4 tunnel set B (China) 
Zhang et al. (2019b)  SVM  H, H/L, geologies  S _{max}  500 – Huquan–Yangjiawan section of Wuhan metro tunnel (China) 
Zhu et al. (2019)  Bayesian networks  Seasonal parameters  S  2762 – Shanghai metro line 1 (China) 
Hajihassani et al. (2020)  PSOANN (8121)  H, AR, SPT, c′, ϕ′, γ, ν _{s}, E _{s}  S _{max}, i _{t}, i _{l}  123 – Karaj urban railway line 2 tunnel (Iran) 
Yan et al. (2020)  ANNSVRELM ensemble algorithm  Measured settlement time histories  S _{max}  70 – Zhuhai tunnel (China) 
Zhang (2020)  MARS  H, AR, P _{f}, SPTs, wc, E _{s}, P _{g}  S _{max}  148 – three MRT tunnels in Singapore 
Zhang et al. (2020)  ANNSVRMARS ensemble algorithm using XGBoost  H, AR, P _{f}, SPTs, wc, E _{s}, P _{g}  S _{max}  148 – three MRT tunnels in Singapore 
^{a} MARS, multivariate adaptive regression splines; PCA, principal component analysis; POD, proper orthogonal decomposition; RQ, rational quadratic; SE, squared exponential
^{b} SPT, standard penetration test; DPT, dynamic penetration test; wc, soil water content; GSI, geological strength index; k _{sub}, modulus of subgrade reaction; κ, slope of soil unload–reload curve
^{c} i _{t}, i _{l}, transverse and longitudinal inflection points, respectively; u, horizontal movement
^{d} MRT, mass rapid transit
While the integration of fuzzy systems has also been used to predict tunnelling-induced settlements, the use of SVMs became popular post-2012, quickly followed by more complex and non-parametric methods such as CARTs and RFs. The prominence of these methods for settlement prediction is perhaps explained by the increased complexity of the input–output mapping process for tunnelling-induced settlements. The data sets used for predicting tunnelling-induced settlements are also largely based on a single project rather than multiple projects, with the size of the data set varying considerably (from 6 to 7650 data points).
Efforts to predict ahead of the TBM involve identification of geological conditions, as well as the size and location of potential obstacles (Schaeffer and Mooney, 2016). In these cases, it is desirable to identify changes in soil conditions as shown in Figure 7. To obtain actionable information during tunnelling, soil conditions must be forecasted sufficiently far in advance of the TBM (typically metres to tens of metres). This is complicated by a deterioration in the accuracy of forecasting techniques with an increase in the forecast horizon.
One approach is to consider the TBM itself as an exploratory tool. A popular implementation of this approach is first to use statistical interpolation techniques (such as kriging) to develop an initial estimate of the ground conditions at the TBM face from available borehole information, as shown in Figure 8 (Gangrade and Mooney, 2019; Grasmick et al., 2020). These predictions are subsequently updated using TBM driving data to obtain a more reliable estimate of the ground immediately ahead of the TBM. This methodology was adopted by Yamamoto et al. (2003) and Sun et al. (2018b). In particular, Sun et al. (2018b) achieved a prediction accuracy of R ^{2} = 0.8 using RFs.
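The two-stage idea of an initial interpolated estimate that is later updated with TBM driving data can be illustrated with a scalar Gaussian update. This is a deliberately simplified sketch and not the method of the cited studies (Sun et al. (2018b), for instance, used RFs). It assumes that the interpolation step supplies a mean and variance for a ground parameter at the face, and that the TBM data yield an independent estimate of the same parameter with its own variance. The function name and all numbers are hypothetical.

```python
def fuse_gaussian(prior_mean, prior_var, obs_mean, obs_var):
    """Combine a prior estimate with an independent observation.

    Both are treated as Gaussian; the result is the precision-weighted
    posterior mean and variance (a standard conjugate update).
    """
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs_mean - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Hypothetical values: an interpolated prior and a TBM-derived estimate
# of the same ground parameter, each with variance 4.
mean, var = fuse_gaussian(prior_mean=10.0, prior_var=4.0,
                          obs_mean=14.0, obs_var=4.0)
print(mean, var)
```

With equal variances the updated mean lies midway between the two estimates and the variance is halved, which reflects the intent of the approach: the estimate close to the face becomes more reliable once driving data are incorporated.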
Alternatively, ML can also be used to provide a direct mapping between TBM performance parameters and ground conditions. This approach can be considered the inverse of the techniques reviewed for TBM performance prediction. Liu et al. (2019) used SVR combined with a stacked single-target technique to identify multiple targets from a common data set, such as UCS, brittleness index (BI), distance between planes of weakness (DPW) and α. This allowed correlation between targets to be incorporated into the prediction model. The driving data used to identify the target variables included RPM, PR, JF, T and cutterhead power, and a prediction accuracy of R ^{2} between 0.63 and 0.83 was achieved. It is notable that R ^{2} = 0.83 corresponded to the UCS prediction, indicating its strong correlation with TBM performance in rock. Zhang et al. (2019c) used SVM, RF and k-nearest neighbours to map RPM, T, JF and advance rate (AR) to rock mass type. Zhao et al. (2019b) compared the performance of eight ML models to predict geological type using feature augmentation to improve performance; a traditional ANN was found to provide the best performance. Jung et al. (2019) also used an ANN to predict the ground type from PR, JF and T with an accuracy of R ^{2} > 0.9. The PR parameter was found to be the most influential for predicting ground type, particularly across different sites. Liu et al. (2020) used a hybrid algorithm combining traditional ANNs with simulated annealing to predict rock parameters UCS, BI, DPW and α from RPM, T, JF and PR (R ^{2} between 0.66 and 0.85). Erharter et al. (2019a, 2019b) used ensemble LSTM networks to classify TBM data into rock behaviour types according to four geological ‘indicators’. Yu and Mooney (2020) employed multinomial logistic regression to characterise the fractional representation of four soil types (sand, clay, silt, till deposits) encountered by an EPB TBM. The regression model was trained using RPM, AR, chamber pressure, excavated soil mass, thrust force and 83 boring logs along the alignment.
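To make the multinomial logistic regression idea concrete, the sketch below maps a feature vector to fractions of the four soil types using a softmax over linear scores. It illustrates the general model form only and does not reproduce the model of Yu and Mooney (2020). The function name, the weights and the feature values are hypothetical placeholders; in practice the weights would be fitted to TBM data and boring logs.

```python
import math

SOIL_TYPES = ("sand", "clay", "silt", "till")


def soil_fractions(features, weights, biases):
    """Return softmax fractions for each soil type.

    weights holds one row of coefficients per soil type and
    biases holds one intercept per soil type.
    """
    scores = [
        b + sum(w * x for w, x in zip(row, features))
        for row, b in zip(weights, biases)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {name: e / total for name, e in zip(SOIL_TYPES, exps)}


# Hypothetical scaled features (e.g. RPM, AR, chamber pressure) and
# placeholder coefficients; these are not fitted values.
features = [0.2, -0.5, 1.0]
weights = [
    [1.0, 0.0, 0.5],
    [-0.5, 0.3, 0.0],
    [0.0, 0.0, 0.0],
    [0.2, -0.1, -0.4],
]
biases = [0.0, 0.1, 0.0, -0.2]
print(soil_fractions(features, weights, biases))
```

Because the outputs are non-negative and sum to one, they can be read as the fractional representation of each soil type for a given set of driving data.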
Instead of using TBM operational parameters, Zhuang et al. (2019) used convergence displacements in rock to infer E _{s} and ν _{s} through inverse analysis. This involved the use of SVR optimised using a multi-strategy artificial fish swarm algorithm (Mafsa). The Mafsa approach is an ensemble algorithm that augments the artificial fish swarm algorithm with DE, PSO, an adaptive step size and a phased vision strategy to enhance the global search capability and improve convergence speed and optimisation accuracy.
While numerous geophysical methods have been explored for forecasting geological conditions ahead of the TBM face (e.g. electromagnetic methods, electrical methods, seismic reflection methods, infrared detection methods), very few studies have explored the integration of ML algorithms to improve geophysical predictions. Both Alimoradi et al. (2008) and Von and Ismail (2017) used an ANN to identify rock characteristics using ground parameters obtained from tunnel seismic prediction technology. Although Von and Ismail (2017) reported a prediction accuracy of R ^{2} = 0.85, they noted that the small data sets at the beginning of a project lead to less reliable predictions.
Wei et al. (2018) documented one of the most comprehensive applications of ML to a new ‘Tunnel Look-ahead Imaging Prediction System’ (Tulips). The Tulips imaging approach comprises three sets of GPR antennae (one low frequency for long-range inspection and two high frequencies to identify small objects) and seismic imaging. The pipeline of their event detection and tracking method is outlined in Figure 9. An experimental campaign showed that buried obstacles can be successfully identified and tracked using this methodology. Those authors also recommended the development and application of more robust ML models to larger data sets that include expert interpretations, ground predictions, TBM data and geological exploration data.
The final research area covered by this literature review is the optimisation of the cutterhead design (see Figure 10), which appears to have focused exclusively on tunnelling in rock. The literature in this area can be further categorised as an optimisation of the (a) cutter disc layout and (b) cutter disc geometry. For the cutter layout, the optimisation process has been typically undertaken to (a) minimise eccentric forces (and therefore moments) of the whole system by maximising cutterhead symmetry, (b) maximise excavation efficiency by ensuring that adjacent cutters score the tunnel face successively and (c) minimise excavation-induced stress on the cutterhead (e.g. Ji et al., 2016). Other common constraints include the following: (a) cutter discs must remain contained within the cutterhead, (b) cutter discs must not overlap, (c) cutter discs must not interfere with manholes, ‘buckets’ or joints in the cutterhead and (d) cutter disc positions should be easily accessible for maintenance (Rostami and Chang, 2017).
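Two of the ingredients listed above, the eccentric-moment objective and the containment and non-overlap constraints, can be sketched for a layout given in polar coordinates. The sketch rests on strong simplifying assumptions that are not taken from the cited studies: every cutter carries the same normal force, and each cutter occupies a circular footprint on the face. The function names and all dimensions are hypothetical.

```python
import math


def eccentric_moment(cutters, normal_force=1.0):
    """Magnitude of the net moment about the cutterhead centre.

    cutters is a list of (radius, polar_angle_in_radians).
    Assumes an equal normal force on every cutter (simplification).
    """
    m_x = sum(normal_force * r * math.sin(t) for r, t in cutters)
    m_y = sum(-normal_force * r * math.cos(t) for r, t in cutters)
    return math.hypot(m_x, m_y)


def layout_is_feasible(cutters, head_radius, footprint):
    """Check that cutters stay within the cutterhead and do not overlap.

    Each cutter is idealised as a circle of diameter `footprint`
    (a hypothetical simplification of the real disc geometry).
    """
    centres = [(r * math.cos(t), r * math.sin(t)) for r, t in cutters]
    contained = all(r + footprint / 2 <= head_radius for r, _ in cutters)
    separated = all(
        math.dist(a, b) >= footprint
        for i, a in enumerate(centres)
        for b in centres[i + 1:]
    )
    return contained and separated


symmetric = [(1.0, k * math.pi / 2) for k in range(4)]
clustered = [(1.0, 0.0), (1.0, 0.1)]
print(eccentric_moment(symmetric), layout_is_feasible(symmetric, 2.0, 0.4))
print(eccentric_moment(clustered), layout_is_feasible(clustered, 2.0, 0.4))
```

The symmetric layout produces an essentially zero net moment and passes both checks, whereas the clustered layout is both unbalanced and overlapping. An optimiser would search over the radii and polar angles to minimise the moment subject to such feasibility checks.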
An example optimisation documented by Huo et al. (2010, 2011) using a multi-objective GA and co-evolutionary GA is presented in Figure 11. Those authors used three ‘base’ designs as the starting point for the optimisation to reflect current designs used in practice: a multi-spiral (Figure 11(a)), a ‘dynamic star’ (Figure 11(b)) and a stochastic pattern (Figure 11(c)). Another possible reason for the use of these base designs is that the results of the optimisation process were reported to be highly dependent on the initial cutter pattern. This was also discovered by Qi et al. (2013) using grey relational analysis (GRA). GRA is a form of grey system theory (proposed by Deng (1982)) and solves multiple-attribute decision-making problems by combining the entire range of attribute values being considered for each alternative decision into a single value (Kuo et al., 2008). Those authors also found that the polar angle played a more important role in the cutter layout than the radial distance from the centre point of the cutterhead. Although not discussed in those studies, these findings suggest the occurrence of local optima in these optimisation problems. While multiple alternative optimisation algorithms exist (e.g. grid search, random search), Bayesian optimisation (Brochu et al., 2010) seems suitable for this problem given its robustness to local optima. This is due to its balance of exploration and exploitation: exploitation initially steers the search towards a local optimum, but exploration allows the algorithm to ‘escape’ from that local optimum in search of an improved global optimum.
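The way GRA collapses several attribute values into one score per alternative can be shown with a short sketch. It follows the commonly described steps of normalising each attribute, measuring the deviation from an ideal reference and averaging the resulting grey relational coefficients. Equal attribute weights and a distinguishing coefficient of 0.5 are assumptions made here for illustration, and the layouts and attribute values are hypothetical rather than taken from Qi et al. (2013).

```python
def grey_relational_grades(alternatives, larger_is_better, zeta=0.5):
    """Return one grey relational grade per alternative.

    alternatives: list of equal-length tuples of attribute values.
    larger_is_better: one bool per attribute.
    zeta: distinguishing coefficient (0.5 assumed here).
    """
    normalised_columns = []
    for column, larger in zip(zip(*alternatives), larger_is_better):
        low, high = min(column), max(column)
        span = high - low
        if span == 0:
            normalised_columns.append([1.0] * len(column))
        elif larger:
            normalised_columns.append([(v - low) / span for v in column])
        else:
            normalised_columns.append([(high - v) / span for v in column])

    # Deviation of each alternative from the ideal reference value of 1.
    deviations = [[1.0 - v for v in row] for row in zip(*normalised_columns)]
    flat = [d for row in deviations for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0] * len(alternatives)

    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in deviations
    ]


# Hypothetical layouts: (eccentric force, excavation efficiency score).
layouts = [(10.0, 80.0), (20.0, 90.0), (30.0, 70.0)]
grades = grey_relational_grades(layouts, larger_is_better=[False, True])
print(grades)
```

The alternative with the highest grade is the preferred one. In this hypothetical example the first two layouts tie, which shows that the ranking can depend on the attribute weighting chosen.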
On the geometric design of individual cutters, Xia et al. (2012, 2015) used GA and multi-objective, multi-geologic-condition optimisation to optimise the (a) cutter cutting edge angle, (b) cutting edge width, (c) transition arc radius and (d) caulking ring width between bearings. The optimisation process sought to minimise the cutter bearing load.
This review has identified an increasing trend in the use of ML in the tunnelling space with a significant increase in 2019. It is likely that this trend will persist as advancements in ML continue to be translated into practical domains for routine use and more tunnelling data are shared with the academic community. ANNs have experienced sustained popularity in this area. This is not surprising as ANNs are one of the oldest ML paradigms and are able to capture complex nonlinear relationships and generalise within the trained parameter space. The second most popular technique is SVR/SVM. The non-parametric nature of these models means that model complexity remains relatively unaffected by an increase in the number of features, and these models are therefore particularly suited to high-dimensional data sets. This may go some way to explaining their popularity, particularly for settlement predictions, where the number of influencing factors is larger. These techniques have been typically coupled with optimisation algorithms to overcome the slow tuning process of the kernel hyperparameters.
The use of fuzzy-based methods such as ANFIS and FL in this area stems from their ability to incorporate human experience and their ability to deal with imprecise and noisy data typical of construction monitoring projects. These methods have not experienced the same growth, which is probably due to the increase in ‘big data’ in tunnelling that lends itself to training more robust algorithms. It is also apparent that there has been a significant and recent increase in the use of alternative ML algorithms such as GEP and RF. These models provide a higher level of performance at the expense of model interpretability and can therefore capture highly nonlinear trends. The use of probabilistic ML techniques, such as Bayesian networks and GPR, for underground construction applications has become more popular in recent years – for example, the studies by Zhang et al. (2016), Wang et al. (2017), Chen et al. (2019d) and Zhu et al. (2019). These methods are well suited to dealing with noisy and incomplete data typical of a construction site and perform predictions within a principled framework. In light of this, they represent the most promising techniques for future applications of ML to inform tunnelling operations.
This paper has presented a comprehensive review of the literature exploring the use of ML to inform tunnelling operations. While ML has been used to inform a wide range of tunnelling applications, this review has identified four main areas of research – namely, TBM performance prediction, tunnelling-induced settlement prediction, geological forecasting and cutterhead design optimisation. Many studies have reported the successful application of ML techniques in tunnelling activities with high levels of accuracy. The most popular methods adopted in the literature include ANNs, SVM/SVR and fuzzy-based methods. A clear upward trend is evident in the use of ML in tunnelling, and this trend is likely to persist as the volume of data produced by modern TBMs continues to grow and the use of ML becomes more commonplace. In most instances, investigators have used empiricism (from previous literature) as the basis for the selection of model inputs, and the number of features varies considerably across the literature. As the number of parameters captured by modern TBMs grows, identification of the most appropriate features for training ML models using robust techniques should be central to future research.
Despite its recent advances, ML in tunnelling remains a young field with many underexplored research opportunities. Some of these opportunities can be observed by contrasting the methods reviewed in this study with those adopted in other disciplines such as aerospace, healthcare, robotics and automated vehicles (Mooney et al., 2020). In particular, there is a real need for continued application of ML methods employing more principled, probabilistic frameworks such as Bayesian networks and GPR. The problems covered by this review appear well suited to probabilistic frameworks given the uncertain nature of tunnelling operations and the prevalence of noisy data. Because such frameworks account for noise explicitly, they relieve engineers of onerous data preprocessing to denoise large training data sets. Furthermore, probabilistic frameworks provide a robust treatment of overfitting, meaning that large data sets are not necessarily a prerequisite and deployment of these techniques on a site-specific basis is feasible.
Another important finding of this review is that most of the studies reviewed here have been developed and validated against a single case history. Validation of these algorithms across a broader parameter space is warranted for the industry to gain confidence in these approaches. As tunnelling data become more accessible, it may also become feasible to interrogate large data sets intelligently for the most appropriate training data for a given project. This would allow the relative performance of ML techniques on future projects to feed back into the improvement of the training data sets. In addition, the high-risk nature of mistakes in the tunnelling industry means that model interpretability, which provides insight into the features driving predictions, is essential for take-up in practice.
Graphical causal inference represents an exciting area for future research. Several authors have argued that some of the most challenging open problems of ML and AI are intrinsically related to causality – for example, Pearl (2000, 2014) and Schölkopf (2019). In particular, the ML models reviewed in this paper suffer from a lack of generalisation (e.g. transfer to new problems). This is because these models are trained only on the most relevant information to limit the associated computational cost. However, information essential for generalisation, such as interventions, domain shifts and temporal structure, is typically neglected. Schölkopf (2019: p. 1) argues that ‘causality, with its focus on modelling and reasoning about interventions, can make a substantial contribution towards understanding and resolving these issues and thus take the field to the next level’. The integration of causal modelling in ML thus represents a promising avenue for more robust treatment of uncertainty in practical domains. It appears essential for the tunnelling industry to begin to consider how best to leverage these recent advances in ML to inform tunnelling operations.
Acknowledgements
This project was supported by the Royal Academy of Engineering under the Research Fellowship scheme and by the Engineering and Physical Sciences Research Council (grant number EP/T006900/1).