DEEP LEARNING ALGORITHMS FOR STRUCTURAL CONDITION IDENTIFICATION WITH LIMITED MONITORING DATA

To obtain the actual conditions of infrastructure assets and manage them more efficiently, extensive research effort has been devoted to structural health monitoring (SHM), especially to approaches using data-driven methods. Recently, deep learning has become a research hotspot in many application areas, including the SHM domain. The performance of deep learning algorithms relies largely on the quality and quantity of the training data, obtained either experimentally or numerically. Due to time and cost constraints, field or laboratory test data normally cover only a limited range of structural conditions, while the quality of numerical simulation data is subject to experts' modelling skills. Therefore, the actual performance of deep learning algorithms with limited training data needs to be studied, and alternative ways to generate more training data need to be developed. In this work, we develop a new one-dimensional convolutional neural network (1D-CNN) for structural condition identification. A laboratory case study is conducted to evaluate the performance of the algorithm. A steel Warren truss bridge structure is constructed and instrumented with accelerometers and an impact hammer. Vibration tests are conducted under seven different scenarios, with five repeated tests per scenario. The algorithm is trained with different quantities of training data (from one to four test data sets per scenario). The results show that condition identification becomes reliable with at least three repeated test data sets. To overcome the challenge of limited monitoring data, we propose the potential application of Generative Adversarial Networks (GANs) to generate more reliable training data.


Introduction
The performance of civil infrastructure, including transport and energy infrastructure, is of great importance for a nation's economy and its people's quality of life. For example, bridges, which connect roads and/or railways over obstacles such as rivers or other roads, are regarded as a vital element of a functioning economy. Yet they inevitably deteriorate over their long service life, which is usually 50 to 100 years. The costs of maintaining and repairing built infrastructure are significant, estimated at 20 per cent of the total construction costs (HM Treasury 2010). For better infrastructure asset management and budget allocation, information about the actual structural conditions is indispensable. Thus, structural health monitoring (SHM) has been proposed and researched extensively worldwide over the past 20 years (Farrar and Worden 2013) and has been applied to an increasing number of real projects (Brownjohn 2007). Through the sensors installed on structures, real-time monitoring data, affected by operational, structural and environmental conditions, can be collected. The monitoring data are expected to provide more detailed information regarding the actual conditions of a structural system than traditional inspection methods. Among the various sensing approaches, the most mature is vibration-based monitoring, which can be realised using accelerometers (acceleration) and/or optical fibre sensors (strain). However, the interpretation of vibration-based monitoring data, i.e. structural condition identification, remains a major challenge in practice.
Vibration data interpretation methods can generally be classified as either physics-based or data-driven. The former, which have been used predominantly in the last 20 years, involve the construction of physics-based numerical models to simulate structural performance, the calculation of one or more features based on both the numerical model and the monitoring data, and/or the updating of numerical models through an optimisation process to minimise the difference between the numerically derived features and those calculated from the monitoring data. Despite their popularity, physics-based methods face two main challenges. First, it is often difficult to find a feature that is sensitive to structural conditions while insensitive to the noise and uncertainties from different sources, such as materials, geometry, environment, and model. Second, such methods suffer from relatively low computational efficiency due to their reliance on complex simulation models. To address these challenges, efforts have shifted to data-driven approaches in recent years, where condition identification is achieved through pattern recognition using machine learning algorithms. With such methods (e.g. Wang and Hao 2015), the features can be generated automatically and may achieve better structural condition identification results than traditional methods, while computational costs can be significantly reduced. The main challenge for the existing data-driven condition identification methods is that they often fail to capture the complexity embedded in the numerous and diverse scenarios encountered in real structures, considering different possible conditions, environmental factors, and loading histories.
Recently developed deep learning methods (LeCun et al. 2015) enable the modelling of complexity through multiple learning layers. They have been successfully applied to many challenging areas, e.g. image understanding, language processing, and the game of Go, and have attracted significant scientific interest (LeCun et al. 2015, Silver et al. 2016). In the SHM domain, the application of deep learning algorithms has gained increasing yet still limited research attention (Cha et al. 2017, Abdeljaber et al. 2017, Abdeljaber et al. 2018, Pathirage et al. 2018, Bao et al. 2018). The existing studies can be categorised into two groups. The first group is a direct adaptation from computer vision applications, i.e. detecting different structural conditions based on image analysis (Cha et al. 2017, Bao et al. 2018). The second group constructs a machine learning algorithm based on a training set of vibration data under different scenarios (Abdeljaber et al. 2017, Abdeljaber et al. 2018, Pathirage et al. 2018). These methods are mainly adapted from algorithms in the image and video recognition domains, i.e. the auto-encoder method and convolutional neural networks (CNN), which classify the input data by computing the features of these data and comparing them with those of existing data. However, the above-mentioned methods, and other data-driven SHM methods, almost always suffer from a lack of training data. Two important questions are: 1) How many training data are needed for the reliable training of a deep learning algorithm? 2) Are there any alternative approaches capable of generating reliable training data for structural condition identification?
In this study, we aim to answer the above two questions. Firstly, a novel structural condition identification framework based on 1D-CNN is proposed, which directly uses time-domain vibration data for training purposes. Secondly, the framework is tested through a laboratory case study (a Warren truss bridge with different levels of connection damage), with different quantities of training data. Finally, alternative ways to generate synthetic data from limited monitoring data are discussed, aiming to improve the identification performance with limited training data. The paper closes with conclusions and recommendations for future work.

Structural Health Monitoring using time-domain vibration data
Compared with the commonly used frequency-domain methods, time-domain structural damage identification methods (Wang et al. 2013, Ay and Wang 2014, Ay et al. 2018) essentially retain all the vibration information, including non-linear and transient effects that the former often miss. They are also more computationally efficient, as no domain transformation is needed. Therefore, in this study, we choose to use time-domain vibration data directly as the training data for machine learning algorithms. From the machine learning perspective (Farrar and Worden, 2013), the fundamental research hypothesis of SHM is that the monitoring data embody various patterns under different structural conditions and that, given a particular monitoring data set, the structural condition can be identified through pattern recognition. Specifically, the problem is to find the pattern $\tilde{y}$ (output) for a new monitoring data set $\tilde{x}$ (input), given a system composed of the monitoring data set $X$ and its corresponding condition/pattern label set $Y \in (1, n)$, where $n$ is the number of patterns. This is achieved by assigning to $\tilde{x}$ the label of the existing vector in $X$ that has the least difference with $\tilde{x}$, using different machine learning algorithms. To make the training data generation more systematic, the patterns can be defined at three levels: damage type, location and severity (Wang and Hao, 2015).
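The "least difference" labelling rule described above can be illustrated with a minimal nearest-pattern sketch. It is only an illustration under stated assumptions, not the classifier used in this study: the Euclidean distance, the function name identify_condition, and the toy decaying-sine records are all hypothetical choices made for the example.

```python
import numpy as np

def identify_condition(x_new, X_train, y_train):
    """Return the label of the training record closest to x_new.

    Closeness is measured with the Euclidean distance (an assumed choice).
    """
    X_train = np.asarray(X_train, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    distances = np.linalg.norm(X_train - x_new, axis=1)
    return y_train[int(np.argmin(distances))]

# Toy time-domain records: three "conditions" that differ only in decay rate
# (hypothetical data, not measurements from the test structure).
t = np.linspace(0.0, 1.0, 200)
X_train = np.stack(
    [np.exp(-d * t) * np.sin(2 * np.pi * 10 * t) for d in (1.0, 3.0, 6.0)]
)
y_train = [0, 1, 2]

# A new record whose decay rate is closest to that of condition 1.
x_new = np.exp(-2.8 * t) * np.sin(2 * np.pi * 10 * t)
label = identify_condition(x_new, X_train, y_train)
```

A learned model such as a CNN replaces the raw distance with features extracted during training, but the input/output relationship is the same: a new record in, an existing pattern label out.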
Indeed, the possible structural damage scenarios are essentially infinite, and thus the monitoring data for training purposes are inevitably incomplete. Therefore, how to classify new monitoring data (that may not belong to any existing pattern label) with limited training data is a critical challenge for the data-driven SHM approaches.

Monitoring data interpretation using 1D-CNN
In this study, we developed a new structural condition identification framework using 1D-CNN. Figure 1 shows the flowchart of this approach. In such a framework, either the real data or the transformed data are used to train a discriminative neural network for structural condition identification. In this work, we use the time-domain data directly (option 1) as the input of the 1D-CNN algorithm. The network output is the structural condition associated with the input data.
There are many CNN models available. In principle, a network with deeper and wider convolution layers performs better. Here, we constructed a 1D-CNN based on AlexNet (Krizhevsky et al. 2012) for efficient training and testing. In particular, we converted all the 2D layers to 1D and used Adam (Kingma and Ba 2014) as the optimiser.
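To make the idea of a 1D convolution layer concrete, the following NumPy sketch runs one convolution, ReLU and max-pooling stage on a single-channel time series. It is a forward pass only and does not reproduce the network used in this study: the signal length, number of filters, kernel width, stride and pooling size are illustrative assumptions, and the weights are random rather than trained.

```python
import numpy as np

def conv1d(x, kernels, bias, stride=1):
    """Valid 1D convolution (cross-correlation, as in most CNN libraries).

    x: (in_channels, length)
    kernels: (out_channels, in_channels, kernel_width)
    bias: (out_channels,)
    """
    out_ch, in_ch, k = kernels.shape
    n_out = (x.shape[1] - k) // stride + 1
    y = np.empty((out_ch, n_out))
    for i in range(n_out):
        window = x[:, i * stride : i * stride + k]  # (in_channels, k)
        # Sum over input channels and kernel taps for every output channel.
        y[:, i] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return y

def relu(x):
    return np.maximum(x, 0.0)

def max_pool1d(x, size):
    """Non-overlapping max pooling along the time axis."""
    n_out = x.shape[1] // size
    return x[:, : n_out * size].reshape(x.shape[0], n_out, size).max(axis=2)

rng = np.random.default_rng(0)
signal = rng.standard_normal((1, 1024))          # one accelerometer channel (assumed length)
kernels = rng.standard_normal((8, 1, 11)) * 0.1  # 8 filters of width 11 (assumed sizes)
bias = np.zeros(8)

features = max_pool1d(relu(conv1d(signal, kernels, bias, stride=4)), size=2)
```

Converting a 2D layer to 1D amounts to sliding each filter along the time axis only, so the 2D kernel of shape (height, width) becomes a vector of kernel_width taps per input channel.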

Laboratory case study
Truss structures are widely used worldwide as highway and railway bridges, while the identification of their conditions is challenging because they involve a large number of members. Therefore, to demonstrate the effectiveness of the proposed algorithm, a steel Warren truss bridge scale model is selected as the test sample. In particular, the proposed 1D-CNN is first trained using data acquired from the test. Compared with the use of numerical simulation data (Wang and Hao 2015, Pathirage et al. 2018), this avoids modelling error. Then, the trained network is used to identify structural conditions based on the time-domain experimental data.
The scale model was built in the laboratory according to AS/NZS 1163:2009. As shown in Figure 2, it is a single-span bridge with eight equilateral triangular sections, a total length of 5.5 m and a width of 0.65 m. For simplicity of design and assembly, all the structural members have an identical cross-section, a square hollow section with dimensions 30×30×3 mm. The structural members come in three lengths: 500 mm for floor deck beams, 600 mm for equilateral truss members, and 800 mm for lateral floor deck bracing. All the splice brackets are identical in dimensions, with a thickness of 5 mm and two M10 bolts per member end. All the bolts are high-tensile ISO Grade 8.8, with lock-nuts and serrated washers to prevent unwanted rattling or loosening during testing. The test structure was supported by two steel sawhorses bolted onto the strong floor in the laboratory to ensure stable supports. The boundary conditions were set as fixed-fixed, which was achieved by clamping the end splice brackets to the sawhorses using C-clamps.

Figure 2 Test schematic
The Endevco 2304 impact hammer (i) was used to strike the structure at the specified point, labelled as the red point in Figure 2. Three Endevco 61C13 accelerometers were placed on the bridge deck to record the vibration of the structure induced by the strike. The impact hammer and accelerometers were connected to a high-frequency dynamic data acquisition card, National Instruments PXIe-4492, housed in a National Instruments PXIe-1078 chassis (ii). A list of the damage scenarios with respect to the corresponding connection groups is shown in Table 2. In the first scenario (S0), all bolts within the test structure were fully tightened to a torque of 25 , which was measured using a torque wrench. This represents the baseline or intact state. The following damage scenarios (S1-S6) were designed as the loosening of test bolts in sets of four, in an accumulating manner. This design aims to represent actual deteriorating structural conditions with an increasing level of connection failures. The loosening was performed manually and confirmed using the wrench. The loosened bolts were labelled as connection groups C1-C6 in Figure 2, which were in the vicinity of the sensors to represent the case when a densely distributed sensor network is available. Five repeated experiments were performed per damage scenario, and thus each damage scenario involves five data sets. More details about the test can be found in Ay (2017).
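The accumulating damage design can be written down as a small lookup structure. The sketch below is illustrative only and rests on one labelled assumption: that scenario Sk loosens connection groups C1 to Ck in numerical order, which is one reading of "in an accumulating manner" and is not confirmed by the text. The names GROUPS, BOLTS_PER_GROUP and loosened_groups are hypothetical.

```python
BOLTS_PER_GROUP = 4                         # bolts are loosened in sets of four
GROUPS = [f"C{i}" for i in range(1, 7)]     # connection groups C1-C6

def loosened_groups(scenario_index):
    """Groups loosened in scenario S<scenario_index>.

    Assumption: groups are loosened cumulatively in the order C1, C2, ..., C6.
    """
    return GROUPS[:scenario_index]

# S0 is the intact baseline; S1-S6 add one loosened group each.
scenarios = {f"S{k}": loosened_groups(k) for k in range(7)}
loosened_bolts = {name: BOLTS_PER_GROUP * len(groups) for name, groups in scenarios.items()}
```

Under this assumption the seven class labels are ordered by severity, from no loosened bolts in S0 to all six groups loosened in S6.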

Results
In this work, only the response data from Accelerometer 1 were used as training data. To examine how many training data are needed to deliver reliable networks for structural identification, we trained the 1D-CNN four times, with an increasing number of experimental data sets as training data, i.e. 1, 2, 3, and 4. Figure 3 shows the evolution of training accuracies over 500 iterations. It can be seen that the training speeds for 1, 2, 3, and 4 sets of test data are almost identical. Initially, the training accuracy increases very quickly. Within 100 iterations, the training accuracy reaches 80%, and it exceeds 90% within 200 iterations. Afterwards, the training accuracy becomes steady and reaches 98% at around 400 iterations. To allow for noise, we do not aim to achieve 100% accuracy in training. Therefore, the parameter setting for this case, i.e. 500 iterations, is appropriate. Figure 4 shows the condition identification accuracies using 1-4 experimental data sets. Previous studies normally used leave-one-out validation to assess the performance of a structural damage identification algorithm. In such a case, 4 experimental data sets are used as training data, and 1 set is used as test data. Using the proposed approach, 100% accuracy is achieved under this condition, meaning that all the damage scenarios are correctly identified. This demonstrates the effectiveness of the proposed framework. With fewer training data, the identification accuracy decreases as expected. With only one set of experimental data as training data and four sets as test data, only 50% of all the scenarios are identified correctly. With two and three sets of data, the accuracies are 85% and 92%, respectively. The results clearly demonstrate the importance of the training data. If fewer than three sets of repeated experimental data exist, it is very hard for a machine learning algorithm to learn a model that is able to classify structural conditions accurately.
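The four training configurations can be sketched as a split of the five repeated tests per scenario into training and test sets. This is a bookkeeping illustration only: it assumes that the first n repeats of each scenario are used for training (the text does not state which repeats were chosen), and the function names are hypothetical.

```python
from itertools import product

N_SCENARIOS = 7   # S0-S6
N_REPEATS = 5     # repeated tests per scenario

def split_repeats(n_train):
    """Split (scenario, repeat) pairs into training and test lists.

    Assumption: the first n_train repeats of every scenario are used for training.
    """
    train, test = [], []
    for scenario, repeat in product(range(N_SCENARIOS), range(N_REPEATS)):
        (train if repeat < n_train else test).append((scenario, repeat))
    return train, test

def accuracy(predicted, actual):
    """Fraction of test records whose predicted scenario matches the true one."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Number of (training, test) records for 1, 2, 3 and 4 training repeats.
split_sizes = {n: tuple(len(part) for part in split_repeats(n)) for n in range(1, 5)}
```

With four training repeats the split coincides with the leave-one-out setting, in which a single repeat per scenario is held out for testing.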

Discussions on the training data
Due to the importance of the training data, repeated experiments are needed for reliable structural condition identification outcomes. However, this may not be feasible for a real SHM system, because the data received every day come essentially from the same intact structural condition, meaning that there are no data for other potential damage scenarios. Although numerical simulation results can also be used to train the networks, it is not always straightforward to simulate complex structural behaviours. Therefore, the difference between numerical results and real structural behaviours may be significant, which could lead to unreliable or even wrong structural condition identification results.
Furthermore, due to varying environmental factors, such as temperature, humidity, and wind speed, even real data received on the same day cannot be regarded as repeated, in contrast to laboratory data, which are acquired under the same test conditions. The introduction of generative adversarial networks (GANs) (Goodfellow et al. 2014) provides an innovative way of constructing a training data set from limited test data through the training of two neural networks, i.e. the generator and the discriminator. The two networks have opposing aims: the former tries to generate synthetic data that resemble the real data, while the latter tries to distinguish the synthetic data from the real data. As both evolve, the synthetic data generated by GANs become more similar to the real data, while the discrimination becomes more effective. GANs have been extensively researched for producing high-quality images.
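For reference, the opposing aims of the two networks are expressed in the cited work (Goodfellow et al. 2014) as a two-player minimax game. In the expression below, G is the generator, D is the discriminator, p_data is the distribution of the real data, and p_z is the prior distribution of the input noise z.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximises V by assigning high probability to real samples and low probability to generated ones, while the generator minimises V by producing samples that the discriminator accepts as real.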
Based on the above development, a new framework is proposed in the context of structural monitoring data interpretation. As shown in Figure 5, it is formulated in two stages: data generation and condition identification. In the first stage (labelled as blue lines), the available monitoring data in the time domain are transformed into a sparse domain, normally the frequency domain. Then, a generative neural network is constructed to produce new data sets with random noise based on the transformed data. In the second stage (labelled as black lines), both the real data and the generated data are used to construct a discriminative neural network, which aims to find more effective features for structural identification, even under high noise levels.
The main difference between the existing framework (Figure 1) and the proposed method (Figure 5) is that the latter can generate more synthetic data for training purposes. Thus, it is expected that the latter will enhance the robustness and effectiveness of structural condition identification.
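The transformation into a sparse domain in the first stage can be illustrated with a discrete Fourier transform of a synthetic impact response. The sketch below covers only this transformation step and does not implement the generative network. The sampling rate, record length, and the two damped modes at 35 Hz and 120 Hz are hypothetical values chosen for illustration, not properties of the test structure.

```python
import numpy as np

fs = 1000.0                   # sampling rate in Hz (assumed)
t = np.arange(2000) / fs      # 2 s record
# Synthetic free-decay response with two damped modes (hypothetical values).
signal = (np.exp(-2.0 * t) * np.sin(2 * np.pi * 35.0 * t)
          + 0.5 * np.exp(-3.0 * t) * np.sin(2 * np.pi * 120.0 * t))

# One-sided spectrum of the real-valued record.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
magnitude = np.abs(spectrum)

dominant_freq = freqs[np.argmax(magnitude)]

# Sparsity check: how many frequency bins carry 95% of the spectral energy?
energy = magnitude ** 2
sorted_energy = np.sort(energy)[::-1]
n_for_95 = int(np.searchsorted(np.cumsum(sorted_energy), 0.95 * energy.sum()) + 1)
```

For a lightly damped response of this kind, most of the energy concentrates in a small fraction of the frequency bins around the modal frequencies, which is what makes the frequency domain a compact representation for a generator to work with.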

Conclusion
This paper proposed a novel structural condition identification framework using a 1D-CNN algorithm based on AlexNet and Adam optimisation. A steel Warren truss bridge was constructed in the laboratory, and repeated impact hammer tests were performed on it under six scenarios with different levels of connection damage plus the intact scenario. The structural condition identification results demonstrate the following:
1) The proposed 1D-CNN framework is very effective in structural condition identification, achieving 100% identification accuracy under the usual leave-one-out validation.
2) With fewer training data, the structural condition identification performance inevitably degrades. If only one experimental data set is used for training, the identification accuracy drops to 50%. This confirms the importance of sufficient training data.
3) This paper proposes the use of GANs to develop a novel structural condition identification framework, which can generate synthetic data to complement the limited monitoring data. It is expected to substantially enhance the performance of structural condition identification.

Figure 5 Structural condition identification framework with GANs