Classifying Lensed Gravitational Waves in the Geometrical Optics Limit with Machine Learning
Singh, Amit Jit;Li, Ivan S. C.;Hannuksela, Otto A.;Li, Tjonnie G. F.;Kim, Kyungmin
2018-10-18 00:00:00
Amit Jit Singh* (a), Ivan S. C. Li (b), Otto A. Hannuksela (a), Tjonnie G. F. Li (a), and Kyungmin Kim (a)
(a) Department of Physics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
(b) Department of Physics, Imperial College London, London SW7 2AZ, United Kingdom
Students: singhamitjit@link.cuhk.edu.hk*, ivan.li16@imperial.ac.uk
Mentors: otto.hannuksela@link.cuhk.edu.hk, tgfli@cuhk.edu.hk, kyungmin.kim@cuhk.edu.hk

ABSTRACT
Gravitational waves are theorized to be gravitationally lensed when they propagate near massive objects. Such lensing produces repeated gravitational wave patterns that are potentially detectable in ground- and space-based gravitational wave detectors. These effects are difficult to discriminate when the lens is small and the repeated patterns superpose. Traditionally, matched filtering techniques are used to identify gravitational-wave signals; we instead aim to use machine learning techniques for this purpose. In this work, we implement supervised machine learning classifiers (support vector machine, random forest, multi-layer perceptron) to discriminate such lensing patterns in gravitational wave data. We train the classifiers with spectrograms of both lensed and unlensed waves, using both point-mass and singular isothermal sphere lens models. The classifiers return $F_1$ scores ranging from 0.852 to 0.996, with precisions from 0.917 to 0.992 and recalls from 0.796 to 1.000, depending on the type of classifier and the lensing model used. This supports the idea that machine learning classifiers are able to correctly identify lensed gravitational wave signals. It also suggests that, in the future, machine learning classifiers may be used as an alternative way to identify lensed gravitational wave events, allowing us to study gravitational wave sources and massive astronomical objects through further analysis.
KEYWORDS
Gravitational Waves; Gravitational Lensing; Geometrical Optics; Machine Learning; Classification; Support Vector Machine; Random Tree Forest; Multi-layer Perceptron

arXiv:1810.07888v2 [astro-ph.IM] 19 Oct 2018

INTRODUCTION
A total of six gravitational wave (GW) signals have been detected by the LIGO and Virgo detectors at the time of writing. These signals are believed to originate from mergers of binary systems, such as binary black holes (BBH) [1–5] or binary neutron stars (BNS) [6]. With these recent detections, we now have the opportunity to study astronomical objects and events through GW observations in addition to the traditional methods of electromagnetic (EM) wave observations.

When a GW signal passes by a massive object, it behaves similarly to light in that the signal becomes gravitationally lensed (as shown in Figure 1). This changes the amplitude of the detected signal, and can cause multiple images of the same GW source to be detected at different times. We consider two lensing models in this study: the point mass lens and the singular isothermal sphere (SIS) [8, 9]. Our aim is to classify an incoming GW signal as lensed or unlensed using machine learning techniques. By doing so, we may be able to better understand the properties of the lens mass and the GW source through further analysis of the GW signal.

Figure 1. A simplified diagram of gravitational lensing of GWs. The binary system is the source of the gravitational waves. $\eta$ is the distance of the binary from the line of sight extending from the lens and Earth. $D_{LS}$ is the distance from the source to the lens and $D_L$ is the distance from the lens to the Earth. $\xi_0$ is the Einstein radius of the lens.

Currently, the main method of identifying GW signals is matched filtering with pre-existing post-Newtonian GW waveform models [1–6, 10]. Some studies also suggest that matched filtering techniques can be used to obtain useful information such as the binary parameters of the GW source [1, 11, 12]. In this study, however, we focus on the ability of supervised machine learning classifiers (MLCs) to determine whether an incoming GW signal is lensed, because a number of difficulties arise with traditional matched filtering. Matched filtering requires accurate waveform templates to compare with the detected signals. The current lensing models are not exact, so their templates may not match actual lensed events, which makes the method relatively imprecise. Furthermore, lensed GWs would require significantly more templates than unlensed GWs, because of the variety of lenses that exist and the additional physical parameters that lensing introduces. Machine learning, on the other hand, does not require an exact model, and the probabilistic nature of machine learning classifiers means that we do not have to create templates covering the various parameters. In particular, we choose to test three popular supervised MLC algorithms: support vector machine, random forest [14], and neural network [15].

In this study, we generate spectrogram samples of lensed and unlensed GW signals under both the point mass lens and SIS models. We use these spectrogram images as the training and testing data for the MLCs, and we analyze the performance of each MLC in identifying lensed GWs by comparing the $F_1$ scores of the classifiers.

Gravitational Lensing of Gravitational Wave Signals
Models for the lensing of GWs have been studied for decades [9, 17–23]. From these studies, it is known that the lensing effect needs to be treated differently depending on the wavelength, $\lambda$, of the GW. We work in the geometrical optics approximation, where the wavelength of the incoming GW, $\lambda$, is much shorter than the Schwarzschild radius of the lens mass.
At longer wavelengths, diffraction becomes dominant and the geometrical optics treatment of GW lensing is no longer valid. Throughout this study, we work in geometrized units (i.e. $G = c = 1$). We also make use of the thin lens approximation, in which the GWs are scattered by a lens whose mass is distributed across a two-dimensional plane perpendicular to the observer's line of sight to the lens mass.

In general terms, the lensed GW, $h_{\rm lens}(f)$, and the unlensed GW, $h(f)$, are related in the frequency domain by $h_{\rm lens}(f) = F(f)h(f)$. The coefficient $F(f)$ is called the amplification factor and, in the geometrical optics limit, is defined as

$$F(f) = \sum_j |\mu_j|^{1/2} \exp\left[2\pi i f t_{d,j} - i\pi n_j\right],$$ Equation 1.

where $\mu_j$ is the magnification factor of the $j$-th image of the lensed GW, $t_{d,j}$ is the time delay of the $j$-th image, and $n_j$ is a discrete number which takes different values depending on the lens model. The magnifications and time delays differ from one lens model to another.

In particular, the amplification factor is found by first using the lens surface mass density of a given lensing model to compute a deflection potential for the lens. Solving an integral involving the deflection potential and various source parameters then yields an expression for $F(f)$. The amplification factor is determined not only by the frequency of the GW, but also by the lens position and its mass distribution.

Point-Mass Lens Model
For a point mass lens model, the lens mass is described by a two-dimensional Dirac delta function on the thin lens plane. Under this model, two images are detected by the observer, and Equation 1 reduces to

$$F(f) = |\mu_+|^{1/2} - i|\mu_-|^{1/2} e^{2\pi i f \Delta t_d}.$$ Equation 2.
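As an illustration of Equation 1, the following minimal sketch evaluates the geometrical-optics amplification factor for an arbitrary set of images. The function name `amplification_factor` and the numerical values in the usage lines are hypothetical choices made for this example and are not taken from the study; the two-image call uses $n = (0, 1/2)$, which reproduces the $-i$ factor of Equation 2.

```python
import numpy as np

def amplification_factor(f, mu, t_d, n):
    """Evaluate Equation 1: F(f) = sum_j |mu_j|^(1/2) exp(2*pi*i*f*t_dj - i*pi*n_j).

    f   : frequencies [Hz], scalar or 1-D array
    mu  : magnification of each image
    t_d : time delay of each image [s]
    n   : discrete index n_j of each image
    """
    f = np.atleast_1d(np.asarray(f, dtype=float))[:, None]   # shape (F, 1)
    mu = np.asarray(mu, dtype=float)[None, :]                 # shape (1, J)
    t_d = np.asarray(t_d, dtype=float)[None, :]
    n = np.asarray(n, dtype=float)[None, :]
    phase = 2.0 * np.pi * f * t_d - np.pi * n
    # Sum the contributions of all images at every frequency.
    return np.sum(np.sqrt(np.abs(mu)) * np.exp(1j * phase), axis=1)

# Hypothetical two-image example (illustrative numbers only).
freqs = np.array([50.0, 100.0])
F_two_images = amplification_factor(freqs, mu=[1.5, -0.5],
                                    t_d=[0.0, 0.01], n=[0.0, 0.5])
```

The modulus of `F_two_images` oscillates with frequency, which is the origin of the beating pattern that the classifiers are later trained to recognize.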
By considering the redshifted lens mass $M_{Lz} = M_L(1 + z)$, we can define the magnification factors of the two interfering GW images as $\mu_\pm = \frac{1}{2} \pm \frac{y^2 + 2}{2y\beta}$ and the time delay between the images as $\Delta t_d = 4M_{Lz}\left[\frac{y\beta}{2} + \ln\left(\frac{\beta + y}{\beta - y}\right)\right]$, where $\beta = \sqrt{y^2 + 4}$. $y$ is a value which parameterizes the relative displacements of the source position, observer, and lens position. This is given by

$$y = \frac{\eta D_L}{\xi_0 D_S},$$ Equation 3.

where $\eta$ is the distance of the source from the line of sight, $D_L$ is the distance from the lens to the observer, $D_S$ is the distance from the source to the observer, and $\xi_0$ is the Einstein radius of the lens.

Singular Isothermal Sphere Model (SIS)
Unlike the point mass lens, the singular isothermal sphere (SIS) model assumes that the lens mass is circularly and symmetrically distributed across the thin lens plane. It is generally used to model extended objects such as large stars or galaxy clusters. The SIS model is usually characterized by a velocity dispersion $v$, which directly relates to the mass distribution of the model [8, 9]. This gives rise to different expressions for the magnification factors and time delay of the interfering GW signals. It should be noted that under the SIS model, no second image is detected by the observer if the source lies outside the Einstein radius ($y \geq 1$). Hence, the amplification factor for the SIS model becomes

$$F(f) = \begin{cases} |\mu_+|^{1/2} - i|\mu_-|^{1/2} e^{2\pi i f \Delta t_d} & \text{if } y < 1, \\ |\mu_+|^{1/2} & \text{if } y \geq 1, \end{cases}$$ Equation 4.

where $\mu_\pm = 1/y \pm 1$ and $\Delta t_d = 8M_{Lz}y$. In this case, $M_{Lz}$ is defined as the mass inside the Einstein radius of the lens mass. The expression for the $y$ parameter in the SIS model is the same as that of the point mass lens model, given by Equation 3.

Machine Learning Algorithms
We select three supervised classifiers whose underlying algorithms for classification and prediction differ notably, in order to find the best method for analyzing spectrograms.
We choose the support vector classifier (SVC), the random forest classifier (RFC) and the multi-layer perceptron classifier (MLP). Below we give a general overview of how each algorithm works, in order to highlight the significant differences in their methods of classification that motivated their selection. We do not focus on the mathematical formulation of each classifier, as it is not pertinent to the overall aim of this research. Readers interested in the mathematics of these algorithms are directed to the references mentioned below.

Support Vector Classifier (SVC)
The SVC is based on the support vector machine algorithm. Each item in our training set is plotted as a point in an $n$-dimensional space, where $n$ is the number of features representing an item and each feature is a coordinate of the item in that space. The algorithm calculates a hyper-plane which divides the two classes of spectrograms and acts as a decision boundary. When we introduce the test set to the classifier, it predicts whether the gravitational wave in a spectrogram is lensed based on where the item lies in the $n$-dimensional space with respect to the hyper-plane [13, 24].

The two most important hyper-parameters that we consider are $C$ and $\gamma$. $\gamma$ determines how the distance of each item in the training set influences the decision boundary. When $\gamma$ is large, the items closer to the boundary carry a larger weight and influence it heavily, while those further away have little influence on its shape. For a small $\gamma$, on the other hand, items near the boundary are given less influence because those further away are also given importance. $C$ sets the trade-off between a smooth decision boundary and the misclassification of training points. If $C$ is large, the algorithm focuses on generating a decision boundary which classifies all items in the training set correctly. A small $C$ prioritizes the smoothness of the decision function and allows for a softer boundary, such that some items in the training set cross the decision boundary into the other class.

Random Forest Classifier (RFC)
The RFC is based on decision tree algorithms. A decision tree consists of a root node, which represents the training set and which splits into decision nodes based on discriminating features that the algorithm finds in the data. These sub-nodes continue to split until only terminal nodes remain, each consisting of a homogeneous piece of the original data that cannot be split any further. When a new test item is introduced to the decision tree, it is classified according to the terminal node it is matched to. In an RFC, multiple decision trees are grown. When making a prediction on an item in the test set, the most popular classification among all the trees is used. This reduces the chance of error, as many randomized decision trees are used rather than a single fixed one.

For this classifier, we consider two hyper-parameters. The first is the number of trees in the forest ($N$). More trees lead to better predictions but also require more computational power, so this parameter must be tuned to avoid using unnecessary computational resources. The second is the minimum number of samples required to allow a node to split ($N_{\rm mss}$).
This is an important factor in controlling over-fitting, as a small value could lead to the algorithm selecting highly specific features that occur only in a small subset of one class and thus do not generalize well over the whole class. A large value, conversely, leads to under-fitting.

Multi-layer Perceptron Classifier (MLP)
The multi-layer perceptron classifier is a feed-forward neural network. As its name suggests, the network consists of multiple layers. The first layer is the input layer, which has one neuron for each feature of the input data. After the input layer, there are a number of hidden layers. In each layer, every neuron performs a linear weighted summation of its inputs and adds a bias term. A non-linear activation function is applied to this sum and the result is forwarded to the neurons in the next layer, where the process is repeated until the output layer is reached. The output layer gives the prediction. However, the output often does not match the expected output, and backpropagation is applied to correct this. Backpropagation aims to minimize the loss function using gradient descent. The loss function quantifies the difference between the actual class and the class predicted by the network. The loss is minimized by adjusting the weights and biases using gradient descent, which relies on the partial derivatives of the loss function with respect to all the weights and biases. Each pass of feed-forwarding and back-propagation over the training set is called an epoch. The MLP algorithm repeats this for a number of epochs until the loss converges.

Although the MLP classifier has a number of hyper-parameters which can be optimized, we only consider the number of layers, the number of neurons in each layer ($N_{\rm layer\_neurons}$) and the L2 penalty term $\alpha$. $\alpha$ is used to regularize over-fitting by adjusting the size of the weights.
A larger $\alpha$ leads to smaller weights and creates a smoother decision boundary, which reduces over-fitting. A smaller $\alpha$, on the other hand, allows larger weights and a more complex decision boundary. The rectified linear unit (relu) function is used as the activation function.

METHODS AND PROCEDURES
Gravitational Wave Model
The unlensed waveform that we consider is from a binary inspiral source. To simulate the lensed GW signal detected by the observer, we inject two GW strains $h(t)$ with a given amplification factor and time delay, depending on the lensing parameters and the model used (point mass lens or SIS). In particular, we generate a waveform computed up to 0.5 post-Newtonian (PN) order. This approximation should be sufficient for the purposes of this study, as we are simply investigating whether a machine learning classifier is able to differentiate between a lensed and an unlensed waveform. For this reason, we also choose to omit the post-merger waveform from our model. We use the following expression for the GW waveform:

$$h(t) = -8\sqrt{\frac{\pi}{5}}\,\frac{\mu}{D}\,e^{-2i\varphi(t)}\,x(t),$$ Equation 5.

where $\mu$ is the reduced mass of the binary inspiral source. To 0.5 PN order, $x(t) = \Theta(t)^{-1/4}/4$ and $\varphi(t) = -\Theta(t)^{5/8}/\nu$ are the post-Newtonian parameter and orbital phase respectively. $\Theta(t) = -\nu t/5M$ is a surrogate time variable, where $M$ is the total mass of the binary system and $\nu = \mu/M$ is the symmetric mass ratio.

Parameter     Range
$M_L$         $10^{[\ldots]}$–$10^{[\ldots]}\,M_\odot$
$D_L$         10–1000 Mpc
$D_{LS}$      10–1000 Mpc
$m_1, m_2$    4–35 $M_\odot$
$\eta$        0–0.5 pc
$z$           0–2

Table 1. Ranges of the randomized parameters of the GW waveforms. $m_1$ and $m_2$ are the masses of the binary source, $D_{LS}$ is the distance between the source and the lens mass, and $z$ is the redshift parameter. The source and lens masses are sampled from a logarithmic distribution to reduce the bias towards more heavily lensed waveforms being generated.

The resultant waveform is shown in Figure 2. A beating effect is clearly evident, because the two interfering waveforms arrive at the observer with a time delay between them.
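The waveform model of Equation 5 and the injection of two time-delayed strains can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the function names, the unit-conversion constants, the sample parameter values, and the choice to apply the $-i$ factor of Equation 2 directly to the complex time-domain strain are all assumptions made for this example.

```python
import numpy as np

M_SUN_S = 4.925491e-6   # one solar mass in seconds (G = c = 1)
MPC_S = 1.029271e14     # one megaparsec in seconds (G = c = 1)

def inspiral_strain(t, m1, m2, distance):
    """0.5 PN inspiral strain of Equation 5; valid only for t < 0 (merger at t = 0)."""
    M = m1 + m2
    mu = m1 * m2 / M                 # reduced mass
    nu = mu / M                      # symmetric mass ratio
    theta = -nu * t / (5.0 * M)      # surrogate time variable, positive before merger
    x = 0.25 * theta ** (-0.25)      # post-Newtonian parameter
    phi = -theta ** (5.0 / 8.0) / nu # orbital phase
    return -8.0 * np.sqrt(np.pi / 5.0) * (mu / distance) * np.exp(-2j * phi) * x

def point_mass_images(y, M_Lz):
    """Magnifications and time delay of the two point-mass lens images."""
    beta = np.sqrt(y ** 2 + 4.0)
    shift = (y ** 2 + 2.0) / (2.0 * y * beta)
    dt = 4.0 * M_Lz * (y * beta / 2.0 + np.log((beta + y) / (beta - y)))
    return 0.5 + shift, 0.5 - shift, dt

def lensed_strain(t, m1, m2, distance, y, M_Lz):
    """Superpose the two images; the second arrives dt later (assumed convention)."""
    mu_p, mu_m, dt = point_mass_images(y, M_Lz)
    first = np.sqrt(abs(mu_p)) * inspiral_strain(t, m1, m2, distance)
    # t - dt stays negative because t < 0 and dt > 0, so the model remains valid.
    second = -1j * np.sqrt(abs(mu_m)) * inspiral_strain(t - dt, m1, m2, distance)
    return first + second

# Hypothetical example: 30 + 30 solar-mass binary at 400 Mpc, 1e4 solar-mass lens.
t = np.linspace(-10.0, -0.01, 4096)
h_unlensed = inspiral_strain(t, 30 * M_SUN_S, 30 * M_SUN_S, 400 * MPC_S)
h_lensed = lensed_strain(t, 30 * M_SUN_S, 30 * M_SUN_S, 400 * MPC_S,
                         y=1.0, M_Lz=1e4 * M_SUN_S)
```

The time grid stops shortly before $t = 0$ because $x(t)$ diverges at the merger in this approximation.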
In our model, the merger of the two component masses in the binary system occurs at $t = 0$, which can be seen from the peak in the GW waveform. Unlensed waveforms are generated simply by injecting a single GW strain instead of two. We randomize the parameters of the lensed and unlensed GW waveforms within the ranges presented in Table 1. We also add Gaussian noise with an amplitude of the order $10^{[\ldots]}$ to test the ability of the classifier to identify a lensed waveform within a noisy background signal.

Signal-to-Noise Ratio (SNR)
The signal-to-noise ratio (SNR) is a measure of the magnitude of a GW signal in relation to the background noise detected. For a given GW waveform $h(f)$, we calculate the SNR using

$${\rm SNR}^2 = 4\int_0^\infty \frac{|h(f)|^2}{S_n(f)}\,df,$$ Equation 6.

where $S_n(f)$ is the power spectral density of the noise profile in the signal. The SNRs of the six recent GW detections ranged from 13 to 32.4 [1–6]. In our study, we choose to limit the SNR of the generated data to be less than 80. This ensures that the data we use for our MLCs are physically valid, and that the generated waveforms do not dominate the noise in the signal.

Figure 2. (a) Lensed Gravitational Wave; (b) Unlensed Gravitational Wave. Each panel shows the strain (in units of $10^{-21}$) and the spectrogram frequency [Hz] against time [s], with the detector noise and the GW waveform plotted together. Top left: A lensed GW waveform under the point mass lens model (with noise, colored in orange) generated from a binary inspiral source. The two interfering GWs that form the resultant lensed waveform are detected by the observer at different times, causing a beating effect to be seen. Bottom left: A spectrogram of the incoming lensed signal, generated by performing a short-time Fourier transform on the signal. Coalescence of the binary system occurs at $t = 0$, as seen from the increase in frequency of the GW over time. The same beating effect due to the lensing of the waveform can be observed. The spectrogram images are then used as the training data for the classifier. Top right: An unlensed GW waveform with noise. The amplitude of the signal is smaller due to the lack of any lensing magnification. Bottom right: A spectrogram of the unlensed signal. No beating effect is seen, and the smaller waveform leads to the noise being a more prominent feature in the spectrogram.

Spectrogram
Since the GW waveform is generated in the time domain, we perform a short-time Fourier transform (STFT) on the data to extract the frequency content of the signal as a function of time. We then create a spectrogram of the incoming signal with the Gaussian noise added, as shown in Figure 2. Depending on the lensing parameters used, the beating effect due to gravitational lensing (seen in Figure 2) varies and is not always distinct, even when the waveform is lensed. We prepare 2000 spectrogram samples of lensed GWs (1000 each for the point mass and SIS lensing models) and 1000 samples of unlensed GWs, which are used in the training and testing of the classifiers. Using Equation 6, we find that our overall data has SNR values with a mean of 41 and a standard deviation of 19.

Optimization of hyper-parameters
Before we train our classifiers, we perform a grid search with cross-validation to determine the optimal combination of hyper-parameters for each classifier. The results of the grid search are presented in Table 2.

Classifier   Hyper-parameter            Value
SVC          $C$                        1000
SVC          $\gamma$                   $10^{[\ldots]5}$
RFC          $N$ (number of trees)      1000
RFC          $N_{\rm mss}$              2
MLP          $\alpha$                   0.72
MLP          $N_{\rm layer\_neurons}$   1000

Table 2. Optimal hyper-parameters for each classifier, determined using a grid search with cross-validation. The hyper-parameters which gave the best score were chosen.

Applying the hyper-parameters stated in Table 2, we train each classifier.
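A grid search with cross-validation of the kind described above can be sketched with scikit-learn. The source does not name the library, the kernel, the searched grid, or the number of folds, so `GridSearchCV`, the RBF kernel, the candidate values, and `cv=5` are all assumptions for this example, and the random arrays merely stand in for flattened spectrograms.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in data: 200 flattened "spectrograms" of 64 pixels each.
# Class 1 carries an extra periodic modulation, loosely mimicking a beating pattern.
n_samples, n_pixels = 200, 64
labels = rng.integers(0, 2, size=n_samples)
pixels = rng.normal(size=(n_samples, n_pixels))
pixels[labels == 1] += np.sin(np.linspace(0.0, 8.0 * np.pi, n_pixels))

# 75% of the samples for training, 25% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    pixels, labels, test_size=0.25, random_state=0, stratify=labels)

# Hypothetical grid; the values actually searched in the study are not stated.
param_grid = {"C": [1, 10, 100, 1000], "gamma": [1e-4, 1e-3, 1e-2]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

best_svc = search.best_estimator_          # refit on the full training set
test_accuracy = best_svc.score(X_test, y_test)
```

Only the training split is passed to the search, so the held-out test set plays no role in choosing the hyper-parameters.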
Then, we use the trained classifier to predict the classes of the spectrograms in the test set.

Evaluation of MLC performance
We assign 75% of our spectrograms to the training set and retain the rest as the test set. We perform two tests for each MLC, each time using a different set of spectrograms: first the unlensed and point-mass-lensed spectrograms, then the unlensed and SIS-lensed spectrograms. This results in two sets of data, one for each lensing model. We analyze the results from the classifiers using the classification report and the receiver operating characteristic curve. The classification report provides several figures of merit (FOM): the precision, $P$, the recall, $R$, and the $F_1$ score of the respective classifiers. These are defined as

$$P = \frac{TP}{TP + FP},$$ Equation 7.

$$R = \frac{TP}{TP + FN},$$ Equation 8.

$$F_1 = \frac{2}{1/P + 1/R},$$ Equation 9.

where $TP$ is the number of true positives (i.e. correctly classified lensed waves), $FP$ is the number of false positives (i.e. unlensed waves misclassified as lensed waves) and $FN$ is the number of false negatives (i.e. lensed waves misclassified as unlensed). Precision is a measure of the accuracy of the positive predictions, whereas recall measures the sensitivity of the classifier in identifying lensed signals. Precision and recall can be merged into a single metric, the $F_1$ score, which is their harmonic mean, as shown in Equation 9.

The receiver operating characteristic (ROC) curve is also used to find the best classifier. The ROC curve plots the true positive rate against the false positive rate. The true positive rate is the same as the recall, while the false positive rate is the fraction of unlensed spectrograms that were classified as lensed. The ROC curve tells us whether the recall can be increased while keeping the false positive rate low. A theoretically perfect classifier would allow us to increase the true positive rate to 1 while maintaining the false positive rate at 0.
This also means that the larger the area under the curve (AUC), the better the classifier is. Figure 4 illustrates this for our classifiers.

RESULTS
After training and testing each classifier, the output data that we obtain includes a confusion matrix, a classification report, and an ROC curve for both the point-mass lens and SIS models. The numbers of true positives, true negatives, false positives, and false negatives of each classification test are represented in a confusion matrix. This information is then presented in a bar chart (Figure 3) to allow us to compare the performance of each classifier.

                 Point-mass Lens                     SIS
Classifier   Precision  Recall  $F_1$ score   Precision  Recall  $F_1$ score
SVC          0.980      1.000   0.990         0.992      1.000   0.996
RFC          0.917      0.796   0.852         0.943      0.864   0.902
MLP          0.951      1.000   0.975         0.961      0.992   0.976

Table 3. Classification report for all three MLCs, including all three figures of merit (FOM). The SVC performed the best in identifying lensed signals under both lensing models, with $F_1$ scores of 0.990 and 0.996, while the MLP classifier performed slightly worse, with $F_1$ scores of 0.975 and 0.976. The performance of the RFC suffered due to its low recalls of 0.796 and 0.864.

As mentioned previously, we use 25% of our data to test the MLCs, meaning that the testing data for each lensing model includes 250 unlensed samples and 250 lensed samples. It can be seen that the SVC had a false negative rate of zero when classifying data under both lensing models, as it correctly identified all lensed samples in both cases. The RFC identified unlensed samples to a good degree of accuracy, but was generally weaker in classifying lensed signals, as seen from its relatively high false negative rate and low true positive rate. The MLP classifier performed to a similar degree of accuracy as the SVC, but mistakenly classified two lensed spectrograms as unlensed under the SIS model.
Using a larger data set and conducting a longer grid search to identify the optimal hyper-parameters for each classifier may allow us to further investigate their respective strengths and shortcomings.

Figure 3. Bar charts showing the performance of each MLC (SVC, RFC, MLP) in classifying lensed and unlensed waveforms: (a) lensed data, with true positives and false negatives for the point-mass and SIS models; (b) unlensed data, with true negatives and false positives for the point-mass and SIS models. The horizontal axes give the number of samples (0 to 250). The charts show the number of correctly classified samples against incorrectly classified samples for data from each lensing model. The RFC is the most prone to misclassifying both lensed and unlensed waves, as seen from its high false positive and false negative rates for both the point mass and SIS models. The SVC and MLP classifier performed much better, with the SVC having a near-perfect overall classification accuracy when classifying data from both lensing models.

The classification report gives the precision, recall, and $F_1$ scores of all three classifiers after fitting, as shown in Table 3. The MLC with the best performance is the SVC, which had $F_1$ scores of 0.990 and 0.996 when classifying lensed GW signals under the point-mass lens model and SIS model respectively. The RFC had lower $F_1$ scores as a result of its weaker recalls of 0.796 and 0.864. This reflects its higher false negative rate, meaning that the RFC was more likely to classify a lensed signal as unlensed. The high $F_1$ scores of 0.975 and 0.976 for the MLP classifier show that it was also able to identify both lensed and unlensed signals to a high degree of accuracy.
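The figures of merit in Table 3 follow directly from the confusion-matrix counts through Equations 7 to 9, as the short sketch below shows. The true positive and false negative counts for the MLP under the SIS model come from the text (250 lensed test samples, two of them missed); the false positive count of 10 is not quoted in the paper and is inferred here only because it reproduces the reported precision of 0.961.

```python
def classification_scores(tp, fp, fn):
    """Precision, recall and F1 score from confusion-matrix counts (Equations 7-9)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2.0 / (1.0 / precision + 1.0 / recall)   # harmonic mean of P and R
    return precision, recall, f1

# MLP classifier, SIS model: tp and fn from the text, fp = 10 is an inferred value.
p_mlp, r_mlp, f1_mlp = classification_scores(tp=248, fp=10, fn=2)

# The F1 score can also be recomputed from a tabulated precision and recall,
# here for the RFC under the point-mass lens model.
f1_rfc = 2.0 / (1.0 / 0.917 + 1.0 / 0.796)
```

Rounded to three decimal places, these values agree with the corresponding entries of Table 3.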
Figure 4. Logarithmic ROC curves of the three classifiers (SVC, random forest, MLP) for both the point-mass lens and SIS lensing models, plotting the true positive rate against the false positive rate (logarithmic axis), with the line of no discrimination shown for reference. (a) Point-Mass Lens Model; (b) SIS Model.

Area under the curve (AUC)
Classifier   Point-mass Lens   SIS
SVC          0.995             0.999
RFC          0.962             0.976
MLP          0.993             0.993

Table 4. The area under the ROC curve for the three classifiers in the two lensing models. The closer the area is to 1, the better the classifier is at classifying the spectrograms.

The ROC curve for the SVC shows that we are able to increase the true positive rate (TPR) nearly to 1 while maintaining a false positive rate (FPR) close to 0 for both lensing models. Similarly, the MLP classifier provides a relatively high TPR and a low FPR. For the RFC, however, reducing the FPR leads to a TPR below 0.6 for the point-mass lens model and around 0.8 for the SIS model. This indicates that increasing the recall of the RFC comes at the cost of a larger number of misclassifications.

It can be seen that, out of the three MLCs we tested, the SVC is the most accurate in identifying a lensed GW signal. It consistently has the highest precision, recall and $F_1$ score in both the point-mass lens and SIS models, with scores of at least 0.980 in all three metrics. Figure 3 indicates that the SVC is more likely to misclassify an unlensed signal as lensed than to misclassify a lensed signal as unlensed, although this result is statistically insignificant because of the limited number of data samples and because the number of SVC misclassifications is still comparatively low. The RFC, on the other hand, has a higher chance of incorrectly predicting lensed GWs as unlensed.
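The ROC curves and AUC values discussed above are built by sweeping a decision threshold over the scores that a classifier assigns to the test samples. The following NumPy sketch shows one way to do this; the function names and the four-sample example are hypothetical, and tied scores are not treated specially.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (FPR, TPR) arrays obtained by lowering the threshold one sample at a time.

    labels : 1 for lensed (positive), 0 for unlensed (negative)
    scores : classifier score, larger meaning "more likely lensed"
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")    # highest score first
    sorted_labels = labels[order]
    true_pos = np.cumsum(sorted_labels == 1)
    false_pos = np.cumsum(sorted_labels == 0)
    tpr = np.concatenate(([0.0], true_pos / true_pos[-1]))
    fpr = np.concatenate(([0.0], false_pos / false_pos[-1]))
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal estimate of the area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical scores for two unlensed (0) and two lensed (1) samples.
fpr, tpr = roc_curve_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
auc = area_under_curve(fpr, tpr)
```

In this toy example one unlensed sample scores higher than one lensed sample, so the area falls below 1; a classifier that ranks every lensed sample above every unlensed one would give an area of exactly 1.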
The ROC curves for both models also indicate that the SVC is the most appropriate classifier for our spectrograms, as it combines a high TPR with a low FPR and has the highest AUC. The RFC performs the worst: raising its recall increases the number of misclassifications, and this poor classification ability is reflected in its AUC, which is the lowest in both models. This further implies that the SVC is the best classifier tested.

It should be noted that performing the grid search and training takes significantly longer for the SVC, while it is quickest for the RFC, owing to the nature of the algorithm itself and to the fact that we could use parallel processing with the RFC. Overall, all three MLCs were able to differentiate between lensed and unlensed GW signals, albeit with varying degrees of accuracy.

DISCUSSION
The use of machine learning classifiers is shown to be viable for identifying lensed GW signals, and can be seen as an alternative to the traditional matched filtering techniques. However, there are many steps we could take to further improve our study and make our findings more rigorous.

One option is to use more accurate waveform models in the generation of the GW signal. As we are currently using 0.5 PN order waveforms, we could instead consider using the higher PN order waveforms provided by the PyCBC library to improve the accuracy of our GW waveforms. Furthermore, an alternative to using pure Gaussian noise as the signal background is to use the noise power spectral densities provided by LIGO. This would allow us to generate more realistic noise, as the characteristic noise profile of the LIGO detector has greater contributions at low and high frequencies, due to seismic noise and photon shot noise respectively [26, 27].
Additionally, the SNR of our generated waveforms has a mean of 41, which is arguably high when compared to the recent GW detections. The binary source and lensing parameters may be fine-tuned in the future to reflect a more physically realistic GW source and lens mass, and ultimately to produce GW signals with a lower SNR. By reducing the SNR values of the data to approximately 30 or below, we will be able to test the limits of the MLCs' capability to identify lensed signals when the GW waveform is overwhelmed by noise.

Due to the nature of grid-search processes and the complexity of MLC algorithms, computational and time limitations restricted our ability to use a larger dataset in this study. It may be argued that the sample size we used for training and testing each classifier was too small to generate statistically significant results. We are considering ways to use a larger dataset of spectrograms with the MLCs while reducing the size of the data, for example through principal component analysis and t-distributed stochastic neighbor embedding, and still maintaining a high accuracy of classification. Another point of consideration is that we only completed a short grid search for the hyper-parameter selection of each MLC. With a more rigorous and exhaustive grid search, better hyper-parameters could be determined and we would be able to further investigate the extent of the MLCs' capabilities.

The implementation of alternative lensing models and MLCs, to further test the validity of using machine learning to identify lensed waveforms, may also be considered in the future. A potential idea for future investigations is to use machine learning not only to identify lensed signals, but also to predict the properties of the GW source and lens mass associated with a given lensed GW signal, through parameter estimation with regression algorithms.
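As a rough indication of how the principal component analysis mentioned in this discussion could shrink flattened spectrograms before classification, the sketch below projects the data onto its leading principal components using a singular value decomposition. It is an illustrative assumption only: the function name, the array sizes, and the number of components are hypothetical, and no such reduction was applied in this study.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the first n_components principal components.

    Returns the reduced data and the fraction of variance explained by each
    retained component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    reduced = X_centered @ Vt[:n_components].T
    return reduced, explained[:n_components]

# Hypothetical stand-in for 100 flattened spectrograms of 50 pixels each.
rng = np.random.default_rng(0)
spectrograms = rng.normal(size=(100, 50))
reduced, variance_ratio = pca_reduce(spectrograms, n_components=5)
```

The reduced array could then be passed to any of the classifiers in place of the full pixel vectors, at the cost of discarding the variance carried by the dropped components.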
CONCLUSIONS
In summary, machine learning classifiers were able to identify lensed GW signals (with mean SNR = 41) to a relatively high degree of accuracy. This suggests that MLCs can be considered a viable alternative to matched filtering techniques in the search for lensing events. Further investigations will be required to test the validity and reliability of machine learning in classifying lensed GW signals, and to better understand the limitations of this approach. If subsequent research supports the validity of this method, a possible avenue to explore in the future would be to use machine learning on a given GW signal to study and extract useful information regarding the associated GW source and lens mass.
ACKNOWLEDGEMENTS
This project and the students were supported by the 2018 Summer Research Program of the Department of Physics, The Chinese University of Hong Kong, and were also partially supported by a grant from the Research Grants Council of Hong Kong (Project No. CUHK 14310816 and CUHK 24304317) and the Direct Grant for Research from the Research Committee of the Chinese University of Hong Kong.
REFERENCES
1. B. P. Abbott et al. Observation of gravitational waves from a binary black hole merger. Phys. Rev. Lett., 116:061102, Feb 2016.
2. B. P. Abbott et al. GW151226: Observation of gravitational waves from a 22-solar-mass binary black hole coalescence. Phys. Rev. Lett., 116:241103, Jun 2016.
3. B. P. Abbott et al. GW170104: Observation of a 50-solar-mass binary black hole coalescence at redshift 0.2. Phys. Rev. Lett., 118:221101, Jun 2017.
4. B. P. Abbott et al. GW170608: Observation of a 19 solar-mass binary black hole coalescence. Astrophys. J. Lett., 851(2):L35, 2017.
5. B. P. Abbott et al. GW170814: A three-detector observation of gravitational waves from a binary black hole coalescence. Phys. Rev. Lett., 119:141101, Oct 2017.
6. B. P. Abbott et al. GW170817: Observation of gravitational waves from a binary neutron star inspiral. Phys. Rev. Lett., 119:161101, Oct 2017.
7. T. Treu and R. S. Ellis. Gravitational lensing - Einstein's unfinished symphony. Contemporary Physics, 56, 12
8. R. Narayan and M. Bartelmann. Lectures on gravitational lensing. In 13th Jerusalem Winter School in Theoretical Physics: Formation of Structure in the Universe, Jerusalem, Israel, 27 December 1995 - 5 January 1996,
9. R. Takahashi and T. Nakamura. Wave effects in the gravitational lensing of gravitational waves from chirping binaries. Astrophys. J., 595(2):1039, 2003.
10. B. J. Owen and B. S. Sathyaprakash. Matched filtering of gravitational waves from inspiraling compact binaries: Computational cost and template placement. Phys. Rev. D, 60:022002, Jun 1999.
11. C. Cutler and E. E. Flanagan. Gravitational waves from merging compact binaries: How accurately can one extract the binary's parameters from the inspiral waveform? Phys. Rev. D, 49:2658–2697, Mar 1994.
12. H. S. Cho. Search and parameter estimate in gravitational wave data analysis and the Fisher matrix. Journal of the Korean Physical Society, 66(10):1637–1641, May 2015.
13. C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, Sep 1995.
14. L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001, and references therein.
15. C. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, United States of America, 2006.
16. N. Chinchor. MUC-4 evaluation metrics. In Proceedings of the 4th Conference on Message Understanding, MUC4 '92, pages 22–29, Stroudsburg, PA, USA, 1992. Association for Computational Linguistics.
17. H. C. Ohanian. On the focusing of gravitational radiation. Int. J. Theor. Phys., 9(6):425–437, Jun 1974.
18. P. V. Bliokh and A. A. Minakov. Diffraction of light and lens effect of the stellar gravitation field. Ap&SS, 34:L7–L9, May 1975.
19. R. J. Bontz and M. P. Haugan. A diffraction limit on the gravitational lens effect. Ap&SS, 78:199–210, Aug
20. K. S. Thorne. The theory of gravitational radiation: An introductory review. In N. Deruelle and T. Piran, editors, Gravitational Radiation, Amsterdam: North Holland, 1983.
21. S. Deguchi and W. D. Watson. Diffraction in gravitational lensing for compact objects of low mass. Astrophys. J., 307:30–37, Aug 1986.
22. P. Schneider, J. Ehlers, and E. E. Falco. Gravitational Lenses. Springer, New York, 1992.
23. T. T. Nakamura and S. Deguchi. Wave optics in gravitational lensing. Prog. Theor. Phys. Suppl., 133:137–153, 1999.
24. A. Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly Media, United States of America, 2007.
25. J. D. E. Creighton and W. G. Anderson. Gravitational-wave physics and astronomy: An introduction to theory, experiment and data analysis. Wiley, Weinheim, 2011.
26. A. Abramovici et al. LIGO: The laser interferometer gravitational-wave observatory. Science, 256(5055):325–333, 1992.
27. B. P. Abbott et al. LIGO: The laser interferometer gravitational-wave observatory. Rept. Prog. Phys., 72:076901, 2009.
ABOUT THE STUDENT AUTHORS
At the time of writing, Amit Jit Singh is beginning his second year of undergraduate study at the Chinese University of Hong Kong, where he is undertaking a BSc in Physics with plans to minor in Computer Science. He intends to attain a Ph.D. in Physics with a focus on data science. Ivan S. C. Li is currently a third-year undergraduate student at Imperial College London, where he is studying for an MSci in Physics. He undertook this research project at the Chinese University of Hong Kong as a summer intern.
PRESS SUMMARY
Following the gravitational wave detections of the past few years, much research has focused on methods of examining and analyzing the waveforms to extract useful data.
An interesting phenomenon that physicists study is gravitational lensing, in which gravitational or electromagnetic wave signals are lensed as they pass massive astronomical objects. If we are able to identify gravitationally lensed signals, we may be able to learn more about the gravitational wave source and the massive object causing the lensing effect. Traditional methods of identifying lensed signals require highly precise and sophisticated lensing models, which we currently do not have. Our research aims to show that machine learning techniques offer a possible alternative for performing this task. Through our study, we demonstrate that machine learning classifiers are able to differentiate between gravitationally lensed and unlensed gravitational-wave detections to a relatively high degree of accuracy, which suggests that machine learning techniques should be considered more frequently in gravitational wave analysis. However, further research is required to determine the extent to which machine learning is able to accurately extract useful information regarding the physical parameters of the gravitational wave source and lensing object.