Ultra-high resolution strain sensor network assisted with an LS-SVM based hysteresis model

Optical fiber sensor network has attracted considerable research interests for geoscience applications. However, the sensor capacity and ultra-low frequency noise limits the sensing performance for geoscience data acquisition. To achieve a high-resolution and lager sensing capacity, a strain sensor network is proposed based on phase-sensitive optical time domain reflectometer (φ-OTDR) technology and special packaged fiber with scatter enhanced points (SEPs) array. Specifically, an extra identical fiber with SEPs array which is free of strain is used as the reference fiber, for compensating the ultra-low frequency noise in the φ-OTDR system induced by laser source frequency shift and environment temperature change. Moreover, a hysteresis operator based least square support vector machine (LS-SVM) model is introduced to reduce the compensation residual error generated from the thermal hysteresis nonlinearity between the sensing fiber and reference fiber. In the experiment, the strain sensor network possesses a sensing capacity with 55 sensor elements. The phase bias drift with frequency below 0.1 Hz is effectively compensated by LS-SVM based hysteresis model, and the signal to noise ratio (SNR) of a strain vibration at 0.01 Hz greatly increases by 24 dB compared to that of the sensing fiber for direct compensation. The proposed strain sensor network proves a high dynamic resolution of 10.5 pε·Hz -1/2 above 10 Hz, and ultra-low frequency sensing resolution of 166 pε at 0.001 Hz. It is the first reported a large sensing capacity strain sensor network with sub-nε sensing resolution in mHz frequency range, to the best of our knowledge.


Introduction
Discovering the physics and dynamics of the solid earth is essential to meet the challenges of natural hazards, energy crisis, and climate changes. Solid earth geoscience is a data-driven filed which demands large datasets of geological events 1 . For geoscience data acquisition, it is required to sense the earth's crustal deformation with a strain resolution less than 10 -9 ε in a frequency region from static to 100 Hz 2 . Moreover, in order to refine the structure image of the errors induced by the Earth crustal damage, the strain sensing for geoscience research requires denser spatial coverage of individual sensors and more accurate recording instruments 3 . The broadband seismometer networks represent the state of arts technology and provide high quality Earth crustal deformation waveforms with broadband frequency content 4 . However, the dense seismometer networks deployed in areas where access and power supply are limited will consume a great effort and resources 5 .
Recently, optical fiber strain sensor has attracted considerable research interests for geoscience applications, owing to their unique advantages of passive, multiplexing, and anti-electromagnetic interference 6 . For instance, optical fiber Bragg grating (FBG) based sensors have been widely exploited to monitor the earth tide, which possess a minimum static strain resolution of 270 pε 2 . However, the multiplexing capacity of the FBG strain sensors is limited by the bandwidth of the reflection spectrum. Moreover, the signal crosstalk between the sensors in the FBG sensor network will increase the noise floor of measurement 7 . Hence, the sensing range and spatial coverage density of the high-quality single point optical fiber sensor networks should be further promoted.
As an alternative approach, optical fiber distributed acoustic sensor (DAS), which offers a long sensing distance without blind area 8,9 , has demonstrated efficient seismic acquisition ability in seismology application such as natural hazard prediction and crust exploration 10,11 . In general, DAS, which is realized based on phase-sensitive optical time domain reflectometry (φ-OTDR) technology, can interrogate the strain and temperature variations along the entire length of an optical fiber 12,13 . With current technologies, DAS has achieved a dynamic strain resolution down to nano-strain level 14 . Moreover, the quasi-DAS assisted with ultra-weak FBG array has been proposed with pico-strain dynamic resolution above 10 Hz frequency range 15,16 . Due to the high sensitivity, high spatial and temporal resolution, DAS has been developed and used for broadband seismic monitoring in oil/gas exploration 17 . Nevertheless, the frequency shift of the laser source, as well as the cross-sensitivity between temperature and strain, severely ruin the signal to noise ratio (SNR) of the DAS in low frequency band 18,19 . The noise in ultra-low frequency band of the DAS system limits the availabilities for nature earthquake recording and earth tide monitoring. In order to improve the sensing performance in low-frequency domain, many efforts have been done to suppress the phase drift of the φ-OT-DR. F. Zhu proposed an active compensation method to reduce the laser frequency shift influence by tracking and comparing the OTDR intensity traces with different frequencies 19 . But the response time of this method will be limited by the number of tuning frequencies. Furthermore, Q. Yuan has demonstrated the twice differential method to compensate the influence of laser frequency shift, and successfully detected the strain variation with amplitude of 5.9 nε at 0.1 Hz 20 . However, the strain resolution of the DAS in the mHz range is still much worse than the high-quality single point optical fiber sensor. Actually, the quasi-static measurement from mHz to Hz is essential to the solid earth geoscience research. To sense the mHz strain information with high sensitivity and dense spatial coverage, it is necessary for the ultralow frequency noise of the DAS to be greatly suppressed.
In this paper, we demonstrated a high-resolution strain sensor network with large sensing capacity based on φ-OTDR technology and scatter enhanced points (SEPs) array. In the prototype system, the optical fiber with SEPs array is divided into sensing fiber and reference fiber. The reference fiber packaged with strain isolation is used to compensate the ultra-low frequency noise induced by laser source frequency shift and environmental temperature change. The characteristics of the above phase noise are systematically analyzed. Then, a hysteresis model based on the hysteresis operator and LS-SVM is introduced to reduce the thermal hysteresis between the sensing fiber and reference fiber. Assisted with the LS-SVM based hysteresis compensation scheme, a strain sensor network with high resolution in the ultralow frequency and large sensing capacity is successfully achieved.

Experimental setup
In order to achieve a large sensing capacity, the proposed high-resolution strain sensing network is interrogated by the φ-OTDR technology. The schematic diagram of the proposed sensing network for geoscience research is illustrated in Fig. 1. The sensing network consists of an interrogation office center and a special packaged sensing cable. In the interrogation system, an ultranarrow linewidth laser (NKT x15) with linewidth of 100 Hz is used as a light source which is divided into a local oscillator laser and a probe laser by a 99:1 optical coupler. The probe laser is modulated into a series of pulses with frequency shift of 200 MHz by the acoustical optical modulator (AOM). After amplified by erbium doped fiber amplifier (EDFA), the pulses are injected into the sensing cable by a circulator. The backscattered light from the sensing cable is mixed with the local oscillator laser by a 50:50 coupler and then received by a balanced photodetector (BPD). Data acquisition card (DAQ) is used to acquire the heterodyne beat signals from the BPD.
The sensing cable is based on the specific packaged optical fiber with SEPs array 21 . The SEPs are inscribed by laser exposure on the single mode fiber (SMF) with a given interval. The scattering efficiency of the SEPs is improved by 10 dB compared to the Rayleigh scattering of the SMF 22 . A pair of SEPs along with the fiber between them compose one sensing section. The fiber with SEPs array is divided in to sensing fiber and reference fiber with the same SEPs number 23 . The sensing fiber is for the sensing of strain, and the reference fiber is for the compensation of the phase drift induced by the environmental temperature and laser frequency variation. The sensing sections in the sensing fiber and reference fiber are respectively defined as sensing channels and reference channels. The SEPs of the sensing channels and the corresponding reference channels are aligned, and fixed with the sensing cable by epoxy curing agents. The sensing fiber in the cable is applied with a pre-strain for sensing of the strain of the cable. While the reference fiber is loosened in the cable to isolate the strain of the cable. Generally, the different packaging method will induce the length difference between the sensing fiber part and reference fiber part, which will introduce the different sensitivity for temperature and laser frequency vibration as well. However, the sensitivity difference between the sensing fiber part and reference fiber part is linear, which can be eliminated by linear fitting. When the spatial width of the injected probe pulse is shorter than the interval between two SEPs of the sensing section, the detected interference signal behaves as a series of beat frequency pulse trains, which can be described as: where A is the amplitude of the interference signal, N is the number of the SEPs in the fiber, n 0 is the refractive index of the fiber core, c is the light speed in the vacuum, τ p is the pulse width of the probe laser, z n is the location of the n-th SEPs in the fiber, and φ n is the phase of the scattering light of the n-th SEP. φ n can be demodulated from the interference signal by the digital orthogonal phase discrimination method, which is deduced as: where I n (t) is the interference signal scattering from the n-th SEP. Theoretically, the phase difference between two SEPs is only dependent on the change of optical path in the sensing section 24 , which can be obtained by subtracting the phase of two SPEs in the sensing section, as expressed blow: where k=1,2, ···, N−1 is the number of the sensing section in the fiber, ν is the optical frequency of the laser, and ε k is the strain applied on the k-th sensing section. When ν and n 0 are constant, the phase signal in the k-th sensing section will be only proportional to ε k . However, it can be observed clearly that the phase signal of the sensing section φ(k) will be also influenced by variation of the laser frequency ν and refractive index n 0 . Generally, the laser frequency ν will shift along with the environmental temperature drift, and the refractive index n 0 is also significantly influenced by the temperature variation of the fiber core. Thus, the laser frequency shift and fiber temperature fluctuation are the dominant noise sources in the ultra-low frequency domain.
In the proposed strain sensor network, a reference fiber with strain isolated package is employed to compensate the phase noise in low frequency band induced by environmental temperature and laser frequency shift.

200037-3
The compensation efficiency is dependent on the phase relationship between the sensing channel and corresponding reference channel. In order to investigate the phase variation characteristics, the laser frequency variation and temperature change were applied on the sensing cable for test. When the laser frequency variation with amplitude of 600 MHz and frequency of 1 Hz was introduced by frequency demodulation, the phase change of the sensing part and reference part are respectively plotted in the Fig. 2(a). The phase waveforms of the sensing channel and reference channel are identical with the variation of the laser frequency. Moreover, the relationship of the phase change between the sensing channel and reference channel is linear, which is illustrated in Fig. 2(b). Therefore, the phase noise induced by laser frequency shift can be compensated by direct differentiation. Furthermore, the temperature fluctuation and the corresponding phase change are plotted in the Fig. 3(a), where the trigonometric temperature fluctuation with the amplitude of 5 °C was applied by a TEC. From Figs.

3(b) and 3(c)
, the hysteresis phenomenon can be observed between the temperature and phase. Moreover, the relationship between the phase change of the sensing channel and reference channel also exhibits an obvious hysteresis characteristic, as shown in the Fig. 3(d). The hysteresis stems from different temperature features induced by the different package methods, which will bring about a significant compensation error by direct differentiation. Therefore, a precision hysteresis model between the sensing channel and reference channel should be built to reduce the temperature compensation error.

Hysteresis operator
Research on the hysteresis has a long history about a century. The Preisach model, which has been widely used for the modeling of hysteresis, has been developed based on incorporates relay operators 25,26 . Relay operator is the simplest example for a hysteresis nonlinearity, which is characterized by threshold a 1 <a 2 and output values +1 and -1, as depicted in Fig. 4(a). For a given input x(t),  the output of the Preisach model y(t) can be expressed as 27 : is the relay operator, r=(a 2 −a 1 )/2 is the half-width value, and s=(a 1 +a 2 )/2 is the mean value of the relay operator. A dividing curve, ψ(r, t), is defined to divide the relay operator into two distinct regions. The output value of the relay operator is +1 in the region below the curve and -1 in the region above the curve, respectively. The dividing curve ψ(r, t) can be considered as the play between two mechanical elements of relay. Therefore, a play operator P r [x] is determined, and the corresponding input-output behavior is illustrated in Fig. 4(b). Mathematically, the play operator can be obtained by linear superposition of relay elements as follows 27 : Therefore, the Preisach operator can be expressed as: where q(r, s) = 2 s 0 w(r, σ)dσ .
According to the general form in Eq. (7), the Preisach model can be decomposed into a continuous set of play operators and a one-to-one mapping function.  Generally, the input-output relationship of the Preisach model can be defined as: where F(•) is a function that maps the play operators set in the real-valued output y(t). Thus, the Preisach model can be implemented by approximating the mapping function based on the input and output value of real hysteresis materials through some identification methods, such as nonnegative least squares.
Least square support vector machine model Least square support vector machine (LS-SVM) has been successfully introduced in modeling application for the global optimization property 28 . Compared to the standard SVM, LS-SVM uses the least squared loss function instead of the ε-insensitive loss function. Hence, the analytical solutions of LS-SVM can be obtained by solving linear equations instead of quadratic programming problem. For a given training data set {(u i , y i ), u i ∈R n , y i ∈R}, the LS-SVM model is expressed as: where f(u) is the nonlinear mapping from the input space to higher feature space, w is the weight vector, and b is the bias. The optimization problem in the LS-SVM is obtained through the equation: where C is the regularization parameters, and ξ i is the error between the actual output and predictive output. The problem can be solved by defining the Lagrangian equation as follow: where a i is the Lagrangian multiplier. The conditions for optimality are the saddle point of L, which can be found by solving the partial derivatives. After eliminating the w and ξ i , the solution is given by the following linear equations: where e=[1, ···, 1] T , y=[y 1 , ···, y n ] T , a=[a 1 , ···, a n ] T are the support vectors, I is the identity matrix, and the Gram matrix Ω is defined as follows: where K(u i , u j ) is the kernel function, which can be used to avoid the computation of the mapping function f(u). Thus, the LS-SVM model for function regression can be formulated as: where a k and b are the solutions of the Eq. (13). The kernel function must satisfy the Mercer condition 29 . The radial base function (RBF) kernel is widely used in nonlinear function regression, which is expressed as: where σ is the width parameter of the KBF kernel. Therefore, for preassigned hyperparameter σ and C, the mapping function can be determined through Eq. (13). Moreover, the hyperparameters σ and C determine the generalization ability and complexity of the LS-SVM model, which should be optimized to ensure a high modeling accuracy along with high testing accuracy.

LS-SVM based hysteresis model
To establish a Preisach model, the density function d(r, s) of the model should be approximated from real hysteresis loops. In the classical Preisach model, the model parameters can be tuned by two approaches. One approach is predefined mathematical forms with a few free parameters 30 , and another approach is using the numerical density function whose values are identified by the method of nonnegative least squares 31 . However, both of the two methods are complex and time-consuming. As an alternative approach, the LS-SVM is an adaptive approximation strategy, which can be used as a general function estimator to approximate the Preisach model mapping function F(•). The LS-SVM based hysteresis model consists of two parts: a discrete play operator, and a memoryless mapping function, which are exhibited in x (t) where |x| max is the maximum absolute value of the input x(t). The number of the hysteresis play operator m determines the modeling accuracy and complexity of the hysteresis model. Additionally, due to that the classical Preisach model is a rate-independent hysteresis model, the rate of the input signal is included in the model so that the model can be used to estimate the rate-dependent hysteresis behavior. Therefore, the input vector of the LS-SVM u consists of m play operators, input signal x(t), and the rate of input . Formally, the output function of play operator , where x(t) is piecewise monotonic on 0=t 0 <t 1 <t 2 <…<t N =t E , is inductively defined by 27 : where p r (x, y) = max{x − r, min{x + r, y}} , (18) where r is the threshold value of the play operator. After converting the model input x(t) into the discrete paly operators, the operator vectors set are used to obtain the memoryless function F(•) through the memoryless mapping function, which is established by a LS-SVM. The LS-SVM is trained though the Eq. (13) with the given training data. The hyperparameters σ and C of the LS-SVM are tuned by the k-fold cross-validation to ensure the high generalization ability of the hysteresis model.

Results and discussion
In the experiment, an optical fiber with 112 SEPs array is enclosed in the sensing cable with the packaging method as shown in Fig. 1. The SEPs array possess -45 dB reflectivity with an interval of 5 m. When a probe laser pulse is injected into the sensing cable, the coherent beat signals backscattering from the sensing cable are recorded in Fig. 6(a). It is obvious the backscattering signal has 112 stable beat frequency pulses with high SNR, corresponding to 112 SEPs. The former 56 beats frequency pulses scattering from the SEPs belong to the sensing fiber and the latter part come from the reference fiber. As the sensing principle describes above, the 56 beat frequency pulses from the sensing fiber or reference fiber can compose into 55 sensing channels S i (1≤i≤55) and 55 reference channels R i (1≤i≤55), respectively. The sensing channel S i and the corresponding reference channel R j (j=56-i) are combined into a sensor element. Therefore, the proposed strain sensor network can interrogate a sensing cable with 55 sensing elements. To calibrate the strain sensitivity of the sensor elements, the static strain from 2 με to 20 με with a step of 2 με is applied to the sensor element 28 at the end of the sensing cable by a precision displacement stage. The relationship between the phase change and the applied strain amplitude is illustrated in Fig. 7, which presents a high sensitivity of -47.04 rad/με and high precision with R-square over 0.999. To train the hysteresis model between the sensing channel and reference channel, a triangular-wave Input Discrete play operator Memoryless mapping function Output  temperature with the same period and variable rate was applied to the sensor element 28, which is shown as the red line in Fig. 8(a). The triangular-wave temperature fluctuation used to train the LS-SVM model can provide a temperature change with different input rate, which can construct a complete set of training data. The corresponding phase of the sensing channel and reference channel are also plotted in the Fig. 8(a), respectively. The relationship between the sensing channel and reference channel is illustrated as the color scatter distribution in Fig. 8(b). The hysteresis loops curve reveals that the thermal hysteresis characters of the strain sensor is dependent on the change rate of the applied temperature. The larger change rate of the temperature will introduce the wider hysteresis loop. Therefore, the thermal hysteresis of the proposed strain sensor is a rate-dependent hysteresis model. The phase signal of the reference channel is set as the input of the proposed LS-SVM based hysteresis model, while the phase signal of the sensing channel is the output. The regression result of the LS-SVM based hysteresis model is plotted as the black line in Fig.  8(b). The results prove the highly regression accurate for the thermal hysteresis modeling. The modeling error of the LS-SVM based hysteresis model is also exhibited in the Fig. 8(c). Compared to the large compensation error of the direct differential method, the root mean squared error (RMSE) between the LS-SVM model output and phase signal of the sensing channel is less than 0.095 rad.

x(t) y(t) LS-SVM
After training the hysteresis model by the hysteresis operator LS-SVM, the phase signal of the sensing channel is compensated by the trained hysteresis model. To estimate the compensation accuracy of the hysteresis model, the sensor element 28 is applied with a temperat-ure fluctuation with amplitude of 5 °C, and strain signal with frequency of 0.01 Hz and amplitude of 20 nε. The original signals of the sensing channel and reference channel are plotted in the Fig. 9(a). The compensation result of the directly differential method and hysteresis model are illustrated in Fig. 9(b), respectively. It can be seen that the measured strain signal by the direct differentiation method is almost buried in the phase noise induced by the temperature fluctuation. However, the strain signal can be observed clearly in the phase of the sensing channel compensated by the hysteresis model. Moreover, the power spectrum density (PSD) of the original signal, compensated signals by direct differential method and hysteresis model is exhibited in the Fig. 9(c). In the PSD of the original signal and signal compensated by direct differentiation, the signal peak of strain vibration at 0.01 Hz is almost buried in the noise peaks induced by the temperature fluctuation. However, after compensating by LS-SVM based hysteresis model, the signal of the strain vibration can be obtained more clearly, and the signal to noise ratio (SNR) of a strain vibration at 0.01 Hz greatly increases by 24 dB compared to that of the sensing fiber for direct compensation. Moreover, the bias drift induced by the laser frequency shift is also suppressed compared with the uncompensated original signal.
Additionally, to evaluate the strain resolution of the proposed strain sensor network, the sensing cable is placed in a quiet environment and isolated with vibration and strain. The original phase signals of the sensing channel and reference channel from sensor element 28 are plotted in Fig. 10(a), where a severe phase drift induced by the environment temperature change and laser frequency variation is observed. The compensated phase in time domain and frequency domain are illustrated in Figs. 10(b) and 10(c) respectively. The sine wave phase noise in the sensing channel and reference channel with frequency nearly 0.001 Hz is introduced by the air conditioner system in the lab, which will control the room temperature with fixed period. The bias drift of the sensing channel in time domain is eliminated by compensating with the LS-SVM based hysteresis model. In the frequency domain, the noise PSD below 0.1 Hz is suppressed clearly by compensation. Compared to the compensation method by direct differentiation, the noise of the compensated sensor by LS-SVM hysteresis model is  further suppressed by 20 dB below 0.01 Hz. For compensation by hysteresis model, the phase power spectral density of the compensated sensor element is better than S p (f)=5.01×10 -4 rad·Hz -1/2 in the frequency range above 10 Hz, corresponding to a minimum dynamic strain resolution of 10.5 pε·Hz -1/2 . Moreover, the phase noise power spectral density below 0.01 Hz even at 0.001 Hz is superior to 0.251 rad·Hz -1/2 , which suggests the ultra-low frequency strain resolution of 166 pε @ 0.001 Hz. Remarkably, the quasi-static sensing resolution in mHz range of the strain sensor network can compete with the high-quality single point fiber sensor. Moreover, the sensor network proves a 55 sensor elements access capacity in the experiment and possesses a potential sensing capacity up to thousands of sensing channels which can match with the DAS system. It is the first reported strain sensor network to have sub-nε resolution in mHz frequency range along with ultra-large sensing capacity to the best of our knowledge. The large sensing capacity along with ultra-high quasi-static sensing resolution makes the proposed strain sensor network has potential applications in geoscience research.