Deep Learning Reveals Underlying Physics of Light-matter Interactions in Nanophotonic Devices

Yashar Kiarashinejad (1, a), Sajjad Abdollahramezani (1, a), Mohammadreza Zandehshahvar (1), Omid Hemmatyar (1), and Ali Adibi (1, b)

(1) School of Electrical and Computer Engineering, Georgia Institute of Technology, 778 Atlantic Drive NW, Atlanta, GA 30332, USA
(a) These authors contributed equally to this work
(b) ali.adibi@ece.gatech.edu

ABSTRACT

In this paper, we present a deep learning-based (DL-based) algorithm, as a purely mathematical platform, for providing intuitive understanding of the properties of electromagnetic (EM) wave-matter interaction in nanostructures. The approach uses dimensionality reduction (DR) to significantly reduce the dimensionality of a generic EM wave-matter interaction problem without imposing significant error. Such an approach implicitly provides useful information about the role of different features (or design parameters such as geometry) of the nanostructure in its response functionality. To demonstrate the practical capabilities of this DL-based technique, we apply it to a reconfigurable optical metadevice enabling dual-band and triple-band optical absorption in the telecommunication window. Combining the proposed approach with existing commercial full-wave simulation tools offers a powerful toolkit to extract basic mechanisms of wave-matter interaction in complex EM devices and to facilitate the design and optimization of nanostructures for a large range of applications including imaging, spectroscopy, and signal processing. It is worth mentioning that the demonstrated approach is general and can be used in a large range of problems as long as enough training data can be provided.
Keywords: deep learning, physical understanding, dimensionality reduction, nanophotonics, metamaterials, plasmonics

1 Introduction

To manipulate the inherent properties (e.g., amplitude, phase, polarization, and frequency) of electromagnetic (EM) waves in the subwavelength regime, nanophotonic structures (especially metamaterials and metasurfaces) have emerged as a promising candidate [1–4]. The development of reliable fabrication techniques to realize such nanostructures has opened up new opportunities for forming reliable flat optical components to replace the existing bulky optical elements. Numerous interesting functionalities have been demonstrated so far, including planar lenses [5], calculus metasurfaces [6–12], meta-holograms [13], and nonlinear meta-modulators [14, 15]. However, systematic realization of mature optical functionalities using complex nanostructures requires significant knowledge about the influence of nanostructure features on the interaction of EM waves, which currently can only be found using cumbersome numerical calculations. Despite extensive efforts in forming new approaches for design and optimization of nanostructures (e.g., using brute-force techniques [9], evolutionary techniques [16] like genetic algorithms [17, 18] or particle swarm optimization [19], semi-analytical modeling [20–27], pattern recognition methods [28], and even neural network-based techniques [29–40]), systematic approaches for understanding the physics of wave-matter interaction in nanostructures and/or the effect of structural properties on their output response are still missing. On the other hand, available design and optimization approaches (e.g., the brute-force techniques) either suffer from significant computation complexity or over-simplify the problem in both response and design domains (e.g., due to severe down-sampling).
Such computation complexity and oversimplification hinder the use of the existing design tools to provide a detailed understanding of the dynamics of light-matter interaction inside nanostructures unless a large set of simulations is performed.

Here, we present a new design and optimization approach based on deep learning (DL) that provides detailed information about the role of design parameters in the output response of any nanostructure as well as intuitive understanding of the physics of wave-matter interaction in these nanostructures without imposing stringent computation complexity. Our approach is based on reducing the dimensionality of the problem in both design and response spaces while preserving the vital information. By training proper neural networks (NNs) for implementation of these dimensionality reduction (DR) processes, we find a complex analytic formula that relates the design parameters to the output response of the nanostructure. Despite their inherent complexity, such analytic relationships provide valuable intuitive information about the role of each design parameter in the overall response of the nanostructure at minimal computation costs. This is in contrast to existing design and optimization techniques, which require costly iterative computations for each design problem without providing intuitive understanding about the role of design parameters [41–43]. This approach also allows for trading off the computation accuracy and complexity. Thus, it can be used for obtaining quick high-level information about the physics of light-matter interaction (e.g., the role of a given design parameter on the overall device performance) or running longer simulations to achieve more detailed information about a specific feature of the structure. By providing the required information for intuitive understanding of the light-matter interaction or design of a class of patterned nanostructures for any desired application with orders of magnitude less computation complexity, this approach can have a transformative impact on several applications that rely on nanostructures including imaging, spectroscopy, signal analysis, sensing, and LiDAR, among others.

The design and optimization approach and its important properties are discussed in Section 2. The results of the application of this technique to practical nanostructures are presented in Section 3. Understanding the underlying physics of the investigated nanostructures as well as more detailed properties of this technique are discussed in Sections 4 and 5, and final conclusions are made in Section 6.

arXiv:1905.06889v1 [physics.optics] 7 May 2019

[Figure 1 flowchart blocks: Set of Random Design Parameters; EM Full-Wave Simulation; Set of Random Responses; Reducing Dimensionality of Response Space; Reduced Response Space; Reducing Dimensionality of Design Space; Reduced Design Parameters; Extracting Underlying Physics]

Figure 1. Procedure of revealing the underlying physics of a generic nanostructure using the DR algorithm. First, a set of random values is fed into the full-wave EM solver as design parameters. The output responses (e.g., reflection spectra) of the corresponding simulated structures are fed to a DR algorithm, forming the reduced response space. In the next step, the reduced response-space data points and the associated random design parameters are used for training of a pseudo-encoder for DR of the design space. Finally, the trained pseudo-encoder provides physical intuition of the investigated EM structure and the role of each design parameter on the overall output response.

2 Deep learning-based Approach for Design and Optimization of Nanostructures

Figure 1 shows the schematic representation of our platform for analysis, design, optimization, and understanding the physics of nanostructures.
Our main focus in this paper is to use such a simulation platform to extract the underlying physics of light-matter interaction through running a complete design and optimization process. Our approach (depicted in Fig. 1) uses the high level of correlation of light-matter interaction in the spatial and spectral domains to considerably reduce the dimensionality of the response space of the problem. Furthermore, the correlation that often exists among the effects of structural design parameters on the response space is used to reduce the dimensionality of the design space. In the first step, a full-wave EM simulation software (based on the finite element method (FEM) implemented in the COMSOL environment, unless otherwise stated) is used to provide a sufficient number of randomly generated instances (the so-called input dataset) to train our DR approach. Each instance is calculated using a given set of randomly selected design parameters (i.e., a point in the design space), and thus, it relates the design space to the response space. After feeding a DR network with a subset of the available training data, we reduce the dimensionality of the response space. By setting the level of acceptable error, we find the minimum accessible dimensionality of the reduced response space. These reduced features are related to the intact dimensions in the original response space through the analytic formulas provided by the dimensionality expansion methods.

[Figure 2 schematic labels: Input, Encoder, Bottleneck, Decoder, Output]

Figure 2. Schematic representation of an autoencoder architecture used in the DR technique. The left half (i.e., the encoder) reduces the dimensionality (the bottleneck layer represents the reduced space) while the right half (i.e., the decoder) recovers the data from the reduced space back to the original space. x_i and x̂_i represent inputs and outputs, respectively.
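The step of setting an acceptable error level and finding the minimum accessible dimensionality can be sketched as a simple search over candidate dimensions. The following is a minimal illustration only: the function name `smallest_dimension`, the dictionary of errors, and the numerical values are hypothetical and are not taken from the simulations in this paper.

```python
def smallest_dimension(mse_by_dim, tolerance):
    """Return the smallest reduced dimension d whose validation MSE is
    at or below `tolerance`, or None if no candidate qualifies."""
    for d in sorted(mse_by_dim):
        if mse_by_dim[d] <= tolerance:
            return d
    return None


# Hypothetical validation errors (reduced dimension -> MSE); illustrative only.
example_mse = {2: 0.21, 4: 0.11, 7: 0.04, 10: 0.035, 16: 0.03}

# With an assumed tolerance of 0.05, the first qualifying dimension is selected.
chosen = smallest_dimension(example_mse, tolerance=0.05)
print("selected dimension:", chosen)
```

In practice, each entry of such a dictionary would come from training one DR model per candidate dimension and evaluating it on held-out data.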
We applied principal component analysis (PCA) [44], kernel PCA (KPCA) [45], and an autoencoder [46] to reduce the dimensionality of our response space (more details are provided in the Supplementary Information). PCA is a linear DR algorithm that projects the data points on the eigenvectors of the covariance matrix of the responses. During the projection, the first d eigenvectors with the highest eigenvalues are selected, where d represents the dimensionality of the reduced response space. KPCA is the nonlinear version of PCA in which a kernel function maps the data points using a nonlinear function and then projects them on the basis vectors of the covariance matrix of the kernelized space. As another DR method, the autoencoder (shown in Fig. 2) encodes the high-dimensional input on the leftmost part to low-dimensional data in the middle layer using a multilayer NN. The same NN can be used to decode and recover the data (back to the original response space) with some error. In other words, an autoencoder is a feedforward NN with the same number of inputs and outputs. The number of neurons in the middle layer sets the dimension of the low-dimensional data, i.e., the desired reduced dimensionality. This layer is also known as the bottleneck of the autoencoder. As shown in Section 3, the autoencoder outperforms PCA and KPCA in our simulations. After training an autoencoder for the DR of the response space (see Fig. 3(a)), we reduce the dimensionality of the design space using a pseudo-encoder architecture (see Fig. 3(b)) to relate the design and response spaces with minimal computation complexity. Since the input and output of the DR mechanism in Fig. 3(b) are different parameters (in contrast to the case for an autoencoder), we call this architecture a pseudo-encoder.
In this manner, we directly include the information about the response of the nanostructure into the reduced design space, which is a major advantage of the pseudo-encoder architecture. Once the DR in both spaces is performed, we form a complete NN-based architecture that directly relates the design parameters to the nanostructure response by integrating the two trained NNs for the DR algorithms shown in Figs. 3(a) and 3(b). Once the underlying NNs are trained, we obtain complex analytic formulas to study in detail the roles of the design parameters in the output features. In addition, the weights of the NNs at different layers for both the autoencoder and the pseudo-encoder can provide valuable information about the role of design parameters on the output response.

3 Analysis of Nanostructures Using the DR-based Technique

To show the applicability of the proposed approach, we consider here two simple design problems for the implementation of a reconfigurable multifunctional metadevice enabling dual-band and triple-band absorption in the telecommunication window. Figure 4 shows the schematic of the supercell structure of the metadevice, which can be electrically tuned to obtain the desired reflection spectrum when illuminated with TM-polarized light (i.e., magnetic field normal to the direction of grating). Considering the maximum sampling value for the periodicity, the supercell in this design can consist of up to two unit cells to effectively suppress the higher diffraction orders in the telecommunication window. Each unit cell is comprised of a gold (Au) nanoribbon incorporating germanium antimony telluride (GST), a well-developed phase-change alloy. Upon non-volatile conversion of GST from the amorphous to the fully crystalline state, a drastic change happens in its refractive index, which consequently induces a remarkable change in the reflection response.
Meanwhile, the intermediate phase transition of GST can be realized by exciting it with an external stimulus (e.g., an electrical current). The nanostructure in Fig. 4 has 7 design parameters, i.e., the widths of the two Au nanoribbons in the supercell (w_1 and w_2), the unit cell periodicities (p_1 and p_2), the crystalline states of the two GST nanostripes (m_1^% and m_2^%, in which the superscript number represents the crystallization fraction), and the thickness of the GST nanostripes (h). It is notable that a specific crystallization fraction (which is associated with the refractive index of GST) can be realized by applying a predefined gate voltage. Here, we assume the thicknesses of the silicon dioxide (SiO2) layer and the Au nanoribbons are defined by the fabrication limitations, and thus, we do not consider them as separate design parameters.

[Figure 3 block labels: (a) Optical Response Features; Encoder; Reduced Response Features; Decoder; Optical Response Features. (b) Design Parameters; Pseudo-encoder; Reduced Design Parameters; Concatenated Neural Layers; Reduced Optical Response Features]

Figure 3. Reducing dimensionality of response and design spaces using autoencoder and pseudo-encoder platforms, respectively. (a) Reducing the optical response features and representing the optical response in the reduced response space. (b) Architecture of a pseudo-encoder that maps the design parameters to the reduced design parameters.

The training of the DR algorithm for the response space is performed with 1700 instances obtained using the FEM simulations for structures with randomly selected design parameters. To obtain these instances, the reflection spectrum (i.e., reflection as a function of frequency) of the structure in Fig. 4 is calculated and sampled over the 150-300 THz range (with 3.75 THz spacing between adjacent samples) to obtain the response-space results. Thus, the number of parameters in the design space is 7, and the number of samples in the response space is 40.
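As a quick arithmetic check of the sampling described above, the 150-300 THz range divided by a 3.75 THz spacing gives 40 intervals, so a grid of 40 samples corresponds to omitting one endpoint (including both would give 41 points). The sketch below assumes, purely for illustration, that the upper endpoint is the one omitted; the text does not state which endpoint is excluded, and the function name `frequency_grid` is hypothetical.

```python
START_THZ = 150.0   # lower edge of the sampled range (from the text)
STOP_THZ = 300.0    # upper edge of the sampled range (from the text)
STEP_THZ = 3.75     # spacing between adjacent samples (from the text)


def frequency_grid(start, stop, step, include_stop=False):
    """Uniform frequency grid; by assumption the upper endpoint is
    excluded unless include_stop is True."""
    count = int(round((stop - start) / step))
    if include_stop:
        count += 1
    return [start + i * step for i in range(count)]


freqs = frequency_grid(START_THZ, STOP_THZ, STEP_THZ)
print(len(freqs), freqs[0], freqs[-1])
```

Each simulated reflection spectrum is then a vector with one entry per grid frequency, which is the 40-dimensional response space that the DR algorithms operate on.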
Details of these simulations are provided in Methods. In addition, we simulated 300 extra structures (with randomly selected design parameters) to obtain the validation dataset. We applied the three DR algorithms (discussed in Section 2) with different numbers of reduced dimensions to the training dataset and tested them with the (unseen) validation dataset. The mean squared error (MSE) for different numbers of dimensions (d) in the reduced response space for the three DR algorithms is shown in Fig. 5(a). In these simulations, a polynomial kernel of degree 7 is selected for the KPCA method. The autoencoder (see Fig. 2) consists of 7 layers in total, and the numbers of nodes in its layers are 40, 30, 20, d, 20, 30, and 40, respectively. Here d represents the dimension of the reduced response space, which is the number of nodes in the bottleneck of the autoencoder in Fig. 2. The activation function for all nodes is fixed to the tangent sigmoid function. As shown in Fig. 5(a), the autoencoder outperforms PCA and KPCA for all values of d. This reveals the effectiveness of the autoencoder in keeping nonlinear properties of the response space. KPCA works slightly better than PCA for lower dimensions (d); however, it has higher MSE as the dimensionality increases because of overfitting. Figure 6 shows the reconstructed spectra using different DR methods for three values of d (d = 2, 7, and 16 for Figs. 6(a), 6(c), and 6(e), respectively, with the respective errors shown in Figs. 6(b), 6(d), and 6(f)). As seen from Fig. 6(c), the autoencoder is able to reconstruct the response spectrum after reducing the dimensionality of the response space from 40 to 7. Figure 5(a) also confirms that the autoencoder with d = 7 is a good choice for the DR of the response space with MSE < 0.05. In the next step, we train the pseudo-encoder with the training dataset and test it with the validation dataset using the approach discussed in .
To simplify the computation, we only consider one layer for the encoder part of the pseudo-encoder. Figure 5(b) shows the MSE as a function of the dimension in the reduced design space D. It is clear that by reducing the dimension of the design space from 7 to 4, MSE < 0.02 is achieved.

Figure 4. Three-dimensional (3D) illustration of the hybrid plasmonic/phase-change material metasurface studied in this paper. The design parameters are the thickness of the GST nanostripes (h), the crystallization fractions of the GST nanostripes (m_1^% and m_2^%), the unit cell periodicities (p_1 and p_2), and the Au nanoribbon widths (w_1 and w_2), while the thickness of the Au nanoribbon is fixed at 30 nm. Each supercell can consist of one or two unit cells, each comprised of a GST nanostripe encapsulated between the Au nanoribbon and the Au back-reflector, separated from each other by symmetric SiO2 spacers. m_1^% and m_2^% are dependent on the gate voltages V_1 and V_2, respectively. The whole structure is illuminated with TM-polarized light in the near-infrared frequency range.

Figure 7(a) shows the pseudo-encoder architecture with the design parameters and the reduced response space being its input and output, respectively. The weights of the first layer in the pseudo-encoder represent the importance of different design parameters. In this manner, each input node (corresponding to each design parameter) is connected to the nodes in the second layer with the strengths shown by the weights. Thus, design parameters with more significant roles have larger weights in this layer. Figure 7(b) shows the weights in the first layer of the pseudo-encoder. It is clear that the GST thickness h has the strongest role in the output response. Thus, the response of the structure is more sensitive to this parameter. Besides, w_i (i = 1, 2) and p_i (i = 1, 2) have nearly equal accumulated intensities (or weights) and thus, similar influence on the response space.
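One way to read the accumulated weight of each design parameter is to sum the magnitudes of the first-layer weights leaving its input node. The sketch below makes several assumptions that are not stated in the text: that magnitudes (absolute values) are summed, that the inputs are ordered as listed in `PARAMETER_NAMES`, and that the 4 x 7 matrix holds invented numbers chosen only to mimic the qualitative trend described here. It does not reproduce the trained weights of Fig. 7(b).

```python
# Assumed input ordering of the 7 design parameters (hypothetical).
PARAMETER_NAMES = ["h", "w1", "w2", "p1", "p2", "m1", "m2"]

# Invented first-layer weights: row k holds the weights from all 7 inputs
# to node k of the 4-node reduced design layer. Illustrative values only.
EXAMPLE_WEIGHTS = [
    [0.9, 0.4, 0.3, 0.5, 0.2, 0.1, 0.0],
    [-0.8, 0.3, -0.5, 0.2, 0.4, 0.0, 0.1],
    [0.7, -0.5, 0.4, -0.3, 0.5, 0.1, -0.1],
    [-0.6, 0.2, 0.3, 0.4, -0.3, -0.1, 0.1],
]


def accumulated_importance(first_layer_weights, names):
    """Sum |weight| over all outgoing connections of each input node and
    return (name, score) pairs sorted from most to least influential."""
    scores = {}
    for j, name in enumerate(names):
        scores[name] = sum(abs(row[j]) for row in first_layer_weights)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = accumulated_importance(EXAMPLE_WEIGHTS, PARAMETER_NAMES)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A ranking of this kind is only a first-order indicator, since later layers can amplify or suppress individual reduced features.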
It is clear that the response space is only slightly affected by the less important design parameter m_i^% (i = 1, 2). This understanding of the relative importance of the design parameters in the output response is very helpful for initializing any sensible optimization process, even with traditional approaches.

4 Understanding the Physics of Light-matter Interaction in Nanostructures

Herein, we illustrate that the interpretation of the weights of the pseudo-encoder can effectively reveal the underlying physics of light-matter interactions in nanostructures. For this purpose, we perform a comprehensive analysis of the fundamental modes of the metasurface in Fig. 4 using full-wave EM simulations in the given frequency range. Further information on the material properties as well as details of the FEM simulation process are provided in the Methods. Figure 8(a) shows that, upon excitation of a unit cell of the structure in Fig. 4 with TM-polarized light, the Au nanoribbon/GST nanostripe and GST nanostripe/Au back-reflector interfaces can support short-range surface plasmon polaritons (SR-SPPs), which are EM waves bound to and travelling along the metal-dielectric interface with short propagation lengths. It should be noted that the effective index contrast at the interfaces of the Au nanoribbon end-faces and air implies that each nanoribbon approximately acts as a lateral mirror-like cavity. Accordingly, constructive interference happens between SR-SPP modes travelling back and forth (i.e., in the x-direction in Fig. 4) between the two ends of the Au nanoribbon. Furthermore, the difference between the refractive indices of GST and SiO2 enhances the effect of the lateral Fabry-Perot cavity in the intermediate GST nanostripe at the interface between the GST and the (lower) Au back-reflector plane. Thus, a similar mode profile exists within that region, which can be ascribed to a confined constructive SPP.
[Figure 5 plots: (a) and (b) show MSE versus Dimensionality; legend in (a): Autoencoder, PCA, Kernel PCA]

Figure 5. (a) MSE for different DR algorithms on the response space. The autoencoder has 7 layers with dimensions 40, 30, 20, d, 20, 30, and 40, where d represents the number of reduced dimensions. The KPCA is trained with a polynomial kernel of degree 7. (b) MSE of the pseudo-encoder for different dimensions of the reduced design space. The number of nodes (or the dimension) of the different layers of the pseudo-encoder are 7, d, 10, 20, 20, 30, 30, 40, and 7 at each layer where d represents the dimensionality of the reduced design space.

The two aforementioned SPPs are the fundamental modes of the metasurface defining any arbitrary response from the structure. We expect that the parameter simultaneously modifying the field profiles of these modes plays the key role in engineering the spectral response of the metasurface. The DR algorithm introduces h as the most influential parameter (see Fig. 7(b)). To justify such a claim, we study the effect of three different h values on the field profile of the metasurface for a fixed set of other parameters. Figure 8(b) shows that when the two Au-GST interfaces (at the top and bottom of the GST nanostripe) are far (i.e., large h), the fundamental modes are spatially separated in a unit cell. By decreasing the distance of these interfaces (i.e., h), the coupling between the SR-SPP and the confined SPP modes sustained by individual interfaces increases until these highly coupled modes form a supermode as the dominant mode of the structure (see Fig. 8(a)). Figure 8(c) shows that further decrease in parameter h results in fading of one of the modes. To further verify our conclusion, we finely sweep h and present the reflection spectrum in Fig. 10(a). This figure illustrates that by changing h, no abrupt change happens in the reflection spectrum profile.
Such a gradual transition verifies the absence of the well-known gap-surface-plasmon resonance, a highly confined magnetic mode, which originates from the circulating displacement current between a metal nanoribbon and a metal back-reflector. This can be ascribed to the remarkably high refractive index of GST at any crystallization fraction. Thus, we conclude that the metasurface supports only the two above-mentioned SPP modes (and no gap-surface-plasmon mode). This makes h the most important design parameter, with the maximum influence on the variation of the reflection spectrum.

Figure 7(b) shows that w_i (i = 1, 2) and p_i (i = 1, 2) have the next most dominant effects on the EM response after h. This is in line with the well-known fact that the resonance frequency of an SPP mode is highly dependent on the width of the nanocavity. Moreover, we found that even by continuously increasing w, only the well-known odd-order SPP modes could be excited (see Fig. 9). This observation indicates that variation of w does not change the inherent nature of these individual SPP modes and keeps them rather decoupled. Figure 10(b) corroborates that, while other parameters in a unit cell are fixed, decreasing (increasing) the width w blueshifts (redshifts) the resonance. However, the reflection response is not as sensitive to w as to h since the nature of both individual SPP modes remains intact while w is varied. More specifically, when the width is modified around w = 350 nm with the other parameters fixed (i.e., h = 170 nm, p = 580 nm, and m^{0%}), the resonance profile in Fig. 10(b) has a broader linewidth compared to its counterpart in Fig. 10(a). This comparison supports the more sensitive nature of the reflection spectrum to h.

On the role of p, it is notable that light diffraction from the surface of the structure is the origin of the confined SPP mode excited at the interface of the GST nanostripe/Au back-reflector.
As a result, the behavior of the overall reflection spectrum relies on the periodicity (i.e., p) of each unit cell. Figure 10(c) illustrates that by increasing p, a smaller portion of the incident light couples to the confined SPP mode, and some part of it reflects in the form of higher diffraction orders. On the other hand, decreasing the periodicity can increase the coupling of adjacent unit cells, which changes the overall reflection response. Finally, Fig. 7(b) shows that m_i^% (i = 1, 2) has the minimum effect on the EM response among the investigated parameters.

[Figure 6 plots: left panels show Reflection Amplitude versus Frequency (THz) for Original Data, Autoencoder, PCA, and KPCA; right panels show Mean Squared Error (MSE) on a logarithmic scale versus Frequency (THz) for Autoencoder, PCA, and KPCA; frequency axis from 160 to 240 THz]

Figure 6. Reconstructed reflection amplitude (left) and reconstruction MSE versus frequency (right) for dimensionality reduction using PCA, KPCA, and autoencoder. Results for reducing the dimensionality from 40 to (a),(b) 2, (c),(d) 7, and (e),(f) 16. The hyper-parameters for KPCA and autoencoder are the same as the parameters used in Fig. 5.

Figure 7. Detailed architecture of the adopted pseudo-encoder. (a) The number of nodes of the pseudo-encoder are 7, 4, 10, 20, 20, 30, 30, 40, and 7 at each layer. (b) Weights of the first layer of the pseudo-encoder in (a), which is highlighted in yellow.
This is also seen from Fig. 10(d), as slightly changing m^% (while keeping all other parameters fixed) results in no major change in the reflection spectrum amplitude. Nevertheless, Fig. 10(d) suggests that the location of the minimum of the reflection spectrum (or the absorption peak) depends on m^%; increasing (decreasing) m^% (i.e., a larger (smaller) GST refractive index) results in a red (blue) shift in the absorption peak. Accordingly, choosing a unit cell with a proper crystallization fraction (m^%) ensures near-unity absorption at the desired frequency. More importantly, a structure with a supercell with two different values of m^% (m_1^% and m_2^% in Fig. 4) can have a multi-band absorption governed by the constructive and/or destructive overlap between the distinct resonance peaks corresponding to the two values of m^%. This design approach can be extended to structures with more sophisticated supercells to form more complex absorption spectra.

[Figure 8 panels (a)-(c); field and axis labels: H_y, E_x, k_z]

Figure 8. EM field distributions in a unit cell of the structure in Fig. 4. Magnetic field (presented by the thermal colormap) and electric field profile (represented by arrows and coded by the rainbow colorbar) for a unit cell with (a) h = 150 nm, (b) h = 250 nm, and (c) h = 50 nm, respectively. The other structural parameters are fixed as p = 550 nm, w = 340 nm, and m^{60%}. The frequency of the incident TM-polarized light is f = 194 THz.

[Figure 9 panels (a)-(c); field and axis labels: H_y, E_x, k_z]

Figure 9. EM field distributions in a unit cell of the structure in Fig. 4. Magnetic field (presented by the thermal colormap) and electric field profile (represented by arrows and coded by the rainbow colorbar) for a unit cell with (a) w = 100 nm (first order mode), (b) w = 300 nm (third order mode), and (c) w = 500 nm (fifth order mode), respectively. The other structural parameters are fixed as p = 550 nm, h = 150 nm, and m^{60%}. The frequency of the incident TM-polarized light is f = 194 THz.

5 Discussion

The intuitive understanding of the role of design parameters on the response of a given nanostructure obtained in Section 4 can facilitate the design of more sophisticated nanostructures using any optimization technique. The computation complexity of any optimization approach depends heavily on the discretization of the values of different design parameters. Our approach suggests that the finest discretization should be used for the most influential design parameter (e.g., h in the structure in Fig. 4) while a sparser discretization is acceptable for less important design parameters (e.g., m_i^% (i = 1, 2) in the structure in Fig. 4). To test this approach, we used our findings in Section 4 about the structure in Fig. 4 to design a multifunctional metadevice providing dual-band absorption (at two distinct frequencies of f_1 = 195 THz and f_2 = 235 THz) and triple-band absorption (at three distinct frequencies of f_1 = 170 THz, f_2 = 205 THz, and f_3 = 240 THz). For device optimization, we used an exhaustive search of the design parameters using the analytic formulas obtained by the trained system that relates the design space to the response space with DR. We also used non-uniform discretization of the design parameters, guided by our findings in Section 4, to minimize the computation complexity. The optimized supercell offered by this approach has two unit cells with similar structural parameters h = 180 nm, p_i = 550 nm (i = 1, 2), and w_i = 340 nm (i = 1, 2). The crystallization states (m_1^{0%}, m_2^{50%}) and (m_1^{0%}, m_2^{0%}) are obtained for the dual- and the triple-band absorption functionalities, respectively. Figure 11 shows the response of the designed multifunctional metadevice. As shown, by slightly changing m_1^% and m_2^%, the locations of the absorption peaks do not considerably change around the optimized values.
This is anticipated given the discussion in Section 4; the overall reflection response of the structure has minimum sensitivity to m_1^% and m_2^%, and thus it is robust against random variations of the external stimulus (here, the gate voltage) or other destructive environmental effects (such as GST oxidation). Another important observation is that by retraining the algorithm using a different set of training data (for the same structure), the weights slightly change, but the trends of their variations remain the same. This means that the intuitive understanding of the roles of the design parameters in the response is to a good degree independent of the training process. We repeated this process at least 20 times with different sets of training data for the structure in Fig. 4 and found the same conclusions in all trials. Nevertheless, we think that for sophisticated nanostructures with many design parameters, training the algorithm with different training sets may reveal different information about the device operation. This is especially valuable for complex nanostructures in which simple simulations (e.g., the ones that resulted in Fig. 11) cannot be used to understand the role of design parameters. It is also not practical to simulate enough structures to learn the role of a specific design parameter due to the computation complexity of such complex nanostructures. It is also worth mentioning that the technique discussed here is not limited to nanostructures; it can be extended to cover many different problems (e.g., fluid mechanics, heat transfer, acoustic wave propagation, etc.) as long as enough training data can be provided.

6 Conclusion

We demonstrated here a DL-based technique for the understanding of the physics of wave-matter interaction in nanostructures.
By using the DR algorithm in the response space and the design space (using an autoencoder and a pseudo-encoder, respectively), we could obtain an analytic formula that relates the design parameters to the response of the nanostructure while providing access to the weights of the neural networks at all layers. By analyzing these weights, important information about the roles of different design parameters in the overall response of the nanostructure can be obtained. This intuitive information can be used to understand the physics of light-matter interaction while facilitating the device optimization process by suggesting a non-uniform discretization of the design parameters to reduce the computation requirements. As such, the approach presented here can have an important impact on the design and understanding of EM wave-matter interaction in nanostructures while being extendable to several other applications.

Figure 10. Reflection amplitude profile of a unit cell of the metasurface shown in Fig. 4 with different structural parameters under illumination by TM-polarized light. Reflection amplitude versus (a) GST nanostripe thickness while the other parameters are chosen as $p$ = 580 nm, $w$ = 350 nm, and $m^{0\%}$, (b) Au nanoribbon width while the other parameters are fixed as $h$ = 170 nm, $p$ = 580 nm, and $m^{0\%}$, (c) periodicity of the unit cell while the other parameters are chosen as $h$ = 170 nm, $w$ = 350 nm, and $m^{0\%}$, and (d) crystallization fraction while the geometrical parameters are fixed as $h$ = 170 nm, $p$ = 580 nm, and $w$ = 350 nm.

Figure 11. Absorption spectra of the designed optimized multifunctional metadevice. (a) Spectra of the original dual-band absorber (solid blue line) at two distinct frequencies, $f_1$ = 195 THz and $f_2$ = 235 THz, and (b) triple-band absorber (solid blue line) at three distinct frequencies, $f_1$ = 170 THz, $f_2$ = 205 THz, and $f_3$ = 240 THz.
Red dashed curves show that slightly modifying $m_i^{\%}$ ($i = 1, 2$) around the optimized values changes the spectra only moderately.

Methods

All full-wave EM simulation results shown in the main text were obtained by using COMSOL Multiphysics 5.3, a commercial full-wave simulation package based on the FEM. The proposed metasurface in Fig. 4 was simulated in a two-dimensional environment with periodic boundary conditions in the $y$ direction. The structure is assumed infinite in the $x$ direction and was excited with a TM-polarized plane wave propagating in the $+z$ direction. The optical properties of the amorphous and the fully crystalline GST were obtained from the literature, and those of the intermediate states were calculated using the well-known Lorentz-Lorenz relation:

$$\frac{\epsilon_{\mathrm{eff}}(f) - 1}{\epsilon_{\mathrm{eff}}(f) + 2} = \frac{m^{\%}}{100}\,\frac{\epsilon_{c}(f) - 1}{\epsilon_{c}(f) + 2} + \left(1 - \frac{m^{\%}}{100}\right)\frac{\epsilon_{a}(f) - 1}{\epsilon_{a}(f) + 2}, \qquad (1)$$

where, for a specific frequency $f$, $\epsilon_{\mathrm{eff}}(f)$ is the effective permittivity of the intermediate state, $\epsilon_{c}(f)$ and $\epsilon_{a}(f)$ are the permittivities of the fully crystalline and the amorphous GST, respectively, and $m^{\%}$, ranging from 0% (i.e., $m^{0\%}$, amorphous) to 100% (i.e., $m^{100\%}$, fully crystalline), is the crystallization fraction of GST. The optical properties of the other materials were also obtained from the literature.

Supplementary Information

S1 Principal Component Analysis (PCA)

PCA is a linear dimensionality reduction method that maps the data into a linear subspace so that the variance is maximized. In other words, PCA projects the data points onto the eigenvectors of the covariance matrix of the data points that have the largest eigenvalues. Among linear projections, PCA gives the minimum MSE during reconstruction. Thus, it gives the best linear representation of the data in the lower-dimensional space in terms of MSE.
Considering $X$ as the response-space matrix whose columns are samples of the reflection amplitude for a specific design, the first step in this algorithm is to centralize the dataset (i.e., subtract the mean from the data points):

$$\hat{X} = X - X_{\mathrm{mean}}, \qquad (S1)$$

where $X_{\mathrm{mean}}$ is the mean matrix of the data points, and $\hat{X}$ is the centralized matrix. The eigenvectors of the covariance matrix of $\hat{X}$ represent the principal components. These vectors are found using the singular value decomposition:

$$\hat{X} = U S V^{T}. \qquad (S2)$$

Here, the columns of the matrix $U$ represent the basis vectors of the covariance matrix $\hat{X}\hat{X}^{T}$. In addition, $S$ is a diagonal matrix, and its diagonal elements ($s_i$) are the singular values associated with the columns of $U$. Therefore, by keeping the first $d$ columns of $U$ with the largest singular values, the projected matrix is

$$PX = U_d^{T}\hat{X}, \qquad (S3)$$

where $U_d$ contains the first $d$ columns of $U$ and $PX$ is the projection of the data in the lower-dimensional space. To reconstruct the projected response, we first invert the projection and then add the mean to the matrix:

$$RX = U_d\,PX + X_{\mathrm{mean}}. \qquad (S4)$$

Finally, the reconstruction error is

$$\sum_{i=1}^{N} \lVert X_i - RX_i \rVert, \qquad (S5)$$

where $N$ is the number of data points, and $X_i$ represents the $i$th column of the response-space matrix.

S2 Kernel PCA (KPCA)

The nonlinear version of PCA is known as kernel PCA (KPCA). Figure S1 shows how PCA and KPCA project data points into the lower-dimensional space. As discussed before, PCA projects the data points onto a linear subspace. However, if there are nonlinear properties in the dataset, PCA might perform poorly. KPCA transforms the original data into a higher-dimensional space using a nonlinear mapping $\phi(x_i)$ for all data points and then projects the transformed data into the lower-dimensional space. KPCA provides better results when we are interested in nonlinear relations in the dataset. The kernel function is defined as $k(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$.
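For comparison with the kernel variant, the linear PCA steps of Eqs. (S1) to (S5) can be sketched numerically. This is a minimal NumPy sketch and not the authors' implementation: it assumes one sample per column (so that the matrix shapes in Eqs. (S3) and (S4) are consistent), uses synthetic data in place of the simulated spectra, and all function names are hypothetical.

```python
import numpy as np

def pca_fit(X, d):
    """Fit PCA on X of shape (D, N): N samples, one per column (assumed layout)."""
    X_mean = X.mean(axis=1, keepdims=True)        # mean sample, shape (D, 1)
    X_hat = X - X_mean                            # Eq. (S1): centralize
    U, S, Vt = np.linalg.svd(X_hat, full_matrices=False)  # Eq. (S2)
    U_d = U[:, :d]                                # d leading left singular vectors
    return X_mean, U_d

def pca_project(X, X_mean, U_d):
    return U_d.T @ (X - X_mean)                   # Eq. (S3), shape (d, N)

def pca_reconstruct(PX, X_mean, U_d):
    return U_d @ PX + X_mean                      # Eq. (S4), shape (D, N)

def reconstruction_error(X, RX):
    return np.linalg.norm(X - RX, axis=0).sum()   # Eq. (S5): sum of per-sample norms

rng = np.random.default_rng(0)
# Synthetic stand-in for spectra: 200 samples in 40 dimensions on a 3-D affine subspace.
X = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 200)) + rng.normal(size=(40, 1))

X_mean, U_3 = pca_fit(X, d=3)
PX = pca_project(X, X_mean, U_3)
err_3 = reconstruction_error(X, pca_reconstruct(PX, X_mean, U_3))

X_mean, U_1 = pca_fit(X, d=1)
err_1 = reconstruction_error(
    X, pca_reconstruct(pca_project(X, X_mean, U_1), X_mean, U_1)
)
print(err_3, err_1)
```

Because the synthetic data lie on a three-dimensional affine subspace, keeping three components reconstructs them up to rounding error, while keeping one component does not. Real spectra would show a gradual decrease of the error with $d$ instead.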
Two well-known kernels are the polynomial kernel $k(x_i, x_j) = (x_i^{T}x_j + c)^{m}$ and the Gaussian kernel $k(x_i, x_j) = e^{-\lVert x_i - x_j\rVert^{2}/2\gamma}$, where $m$ and $\gamma$ are the free parameters. The best parameters can then be found using the cross-validation technique. In this work, we used the polynomial kernel to reduce the dimension of the data and compared the results with the other methods. To implement the kernel PCA, we should do the following:

1. Compute the Gram matrix $K$, where $K_{ij} = k(x_i, x_j)$. Note that the dataset should be centralized to have zero mean in the transformed space. The centralized Gram matrix is

$$\tilde{K} = K - 1_N K - K 1_N + 1_N K 1_N, \qquad (S6)$$

where $1_N$ is an $N \times N$ matrix with all elements equal to $1/N$.

2. Find the basis vectors of the transformed space by using the eigen-decomposition of the Gram matrix.

3. Project the data points onto the first $d$ eigenvectors with the highest eigenvalues.

Figure S1. Linear PCA projects the data points onto the direction of the largest principal component. Kernel PCA, however, maps the data points into another space using a nonlinear kernel function and then projects them onto the principal components of the new space.

S3 Autoencoder

An autoencoder is a neural-network-based dimensionality reduction network. Figure 2 shows a schematic of an autoencoder with a multilayered NN that maps the high-dimensional input on the leftmost layer to the low-dimensional data in the middle layer. The same NN can be used to decode and recover the data back to the original space with a specific error. In other words, the data from the input layer are first compressed and subsequently decompressed into data that closely match the original data. For the simplest case (i.e.,
mono-layer encoder and mono-layer decoder), the relations between the high-dimensional and low-dimensional data are:

$$z = \sigma(Wx + b), \qquad (S7)$$

$$x' = \sigma'(W'z + b'), \qquad (S8)$$

in which $x$, $z$, $W$, $W'$, $b$, $b'$, $\sigma$, and $\sigma'$ represent the high-dimensional data, low-dimensional data, encoder weight matrix, decoder weight matrix, encoder bias, decoder bias, encoder activation function, and decoder activation function, respectively. To find the optimum weights for the autoencoder, the MSE loss function should be minimized:

$$L = \lVert x - x' \rVert^{2} = \lVert x - \sigma'(W'\sigma(Wx + b) + b') \rVert^{2}. \qquad (S9)$$

S4 Comparison of PCA, KPCA, and Autoencoder

Figures S2 and S3 demonstrate the performance of PCA, KPCA, and the autoencoder for reconstructing the reflection spectra from the reduced space. The results show the effectiveness of the dimensionality reduction in recovering the reflection spectra after finding the optimum low-dimensional space.

References

1. Jahani, S. & Jacob, Z. All-dielectric metamaterials. Nat. nanotechnology 11, 23 (2016).
2. Yu, N. et al. Light propagation with phase discontinuities: generalized laws of reflection and refraction. science 1210713 (2011).
3. Arbabi, A., Horie, Y., Bagheri, M. & Faraon, A. Dielectric metasurfaces for complete control of phase and polarization with subwavelength spatial resolution and high transmission. Nat. nanotechnology 10, 937 (2015).
4. Hsiao, H.-H., Chu, C. H. & Tsai, D. P. Fundamentals and applications of metasurfaces. Small Methods 1, 1600064 (2017).
5. Khorasaninejad, M. et al. Metalenses at visible wavelengths: Diffraction-limited focusing and subwavelength resolution imaging. Science 352, 1190–1194 (2016).
6. AbdollahRamezani, S., Arik, K., Khavasi, A. & Kavehvash, Z. Analog computing using graphene-based metalines. Opt. letters 40, 5239–5242 (2015).
7. Chizari, A., Abdollahramezani, S., Jamali, M. V. & Salehi, J. A. Analog optical computing based on a dielectric meta-reflect array. Opt. letters 41, 3451–3454 (2016).
8. Abdollahramezani, S., Chizari, A., Dorche, A.
E., Jamali, M. V. & Salehi, J. A. Dielectric metasurfaces solve differential and integro-differential equations. Opt. letters 42, 1197–1200 (2017).
9. Campbell, S. D. et al. Review of numerical optimization techniques for meta-device design. Opt. Mater. Express 9, 1842–1863 (2019).
10. Sakurai, A. et al. Ultranarrow-band wavelength-selective thermal emission with aperiodic multilayered metamaterials designed by bayesian optimization. ACS central science 5, 319–326 (2019).
11. Pestourie, R. et al. Inverse design of large-area metasurfaces. Opt. express 26, 33732–33747 (2018).
12. Ma, W., Cheng, F. & Liu, Y. Deep-learning-enabled on-demand design of chiral metamaterials. ACS nano 12, 6326–6334 (2018).
13. Chen, W. T. et al. High-efficiency broadband meta-hologram with polarization-controlled dual images. Nano letters 14, 225–230 (2013).
14. Taghinejad, M. et al. Ultrafast control of phase and polarization of light expedited by hot-electron transfer. Nano letters 18, 5544–5551 (2018).
15. Taghinejad, M. et al. Hot-electron-assisted femtosecond all-optical modulation in plasmonics. Adv. Mater. 30, 1704915 (2018).
16. Seidel, S. Y. & Rappaport, T. S. Site-specific propagation prediction for wireless in-building personal communication system design. IEEE transactions on Veh. Technol. 43, 879–891 (1994).
17. Gondarenko, A. & Lipson, M. Low modal volume dipole-like dielectric slab resonator. Opt. express 16, 17689–17694 (2008).
18. Håkansson, A. & Sánchez-Dehesa, J. Inverse designed photonic crystal de-multiplex waveguide coupler. Opt. Express 13, 5440–5449 (2005).
19. Ong, J. R., Chu, H. S., Chen, V. H., Zhu, A. Y. & Genevet, P. Freestanding dielectric nanohole array metasurface for mid-infrared wavelength applications. Opt. letters 42, 2639–2642 (2017).
20. Piggott, A. Y., Petykiewicz, J., Su, L. & Vučković, J. Fabrication-constrained nanophotonic inverse design. Sci. Reports 7, 1786 (2017).
21. Lu, J. & Vučković, J. Nanophotonic computational design. Opt.
express 21, 13351–13367 (2013).
22. Su, L., Piggott, A. Y., Sapra, N. V., Petykiewicz, J. & Vuckovic, J. Inverse design and demonstration of a compact on-chip narrowband three-channel wavelength demultiplexer. ACS Photonics 5, 301–305 (2017).
23. Frellsen, L. F., Ding, Y., Sigmund, O. & Frandsen, L. H. Topology optimized mode multiplexing in silicon-on-insulator photonic wire waveguides. Opt. express 24, 16866–16873 (2016).
24. Piggott, A. Y. et al. Inverse design and implementation of a wavelength demultiplexing grating coupler. Sci. reports 4, 7210 (2014).
25. Englund, D., Fushman, I. & Vuckovic, J. General recipe for designing photonic crystal cavities. Opt. express 13, 5961–5975 (2005).
26. Molesky, S. et al. Inverse design in nanophotonics. Nat. Photonics 12, 659 (2018).
27. Mansouree, M. & Arbabi, A. Large-scale metasurface design using the adjoint sensitivity technique. In Conference on Lasers and Electro-Optics, FF1F.7, DOI: 10.1364/CLEO_QELS.2018.FF1F.7 (Optical Society of America, 2018).
28. Melati, D. et al. Mapping the global design space of integrated photonic components using machine learning pattern recognition. arXiv preprint arXiv:1811.01048 (2018).
29. Ma, W., Cheng, F. & Liu, Y. Deep-learning enabled on-demand design of chiral metamaterials. ACS nano (2018).
30. Baxter, J. et al. Plasmonic colours predicted by deep learning. arXiv preprint arXiv:1902.05898 (2019).
31. Liu, D., Tan, Y., Khoram, E. & Yu, Z. Training deep neural networks for the inverse design of nanophotonic structures. ACS Photonics 5, 1365–1369 (2018).
32. Peurifoy, J. et al. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. advances 4, eaar4206 (2018).
33. Liu, Z., Zhu, D., Rodrigues, S., Lee, K.-T. & Cai, W. A generative model for the inverse design of metasurfaces. Nano letters (2018).
34. Tahersima, M. H. et al. Deep neural network inverse design of integrated nanophotonic devices. arXiv preprint arXiv:1809.03555 (2018).
35.
Zhang, T. et al. Spectrum prediction and inverse design for plasmonic waveguide system based on artificial neural networks. arXiv preprint arXiv:1805.06410 (2018).
36. Qu, Y., Jing, L., Shen, Y., Qiu, M. & Soljacic, M. Migrating knowledge between physical scenarios based on artificial neural networks. arXiv preprint arXiv:1809.00972 (2018).
37. Inampudi, S. & Mosallaei, H. Neural network based design of metagratings. Appl. Phys. Lett. 112, 241102 (2018).
38. Yao, K., Unni, R. & Zheng, Y. Intelligent nanophotonics: merging photonics and artificial intelligence at the nanoscale. arXiv preprint arXiv:1810.11709 (2018).
39. So, S., Mun, J. & Rho, J. Simultaneous inverse design of materials and parameters of core-shell nanoparticle via deep-learning: Demonstration of dipole resonance engineering. arXiv preprint arXiv:1904.02848 (2019).
40. Asano, T. & Noda, S. Optimization of photonic crystal nanocavities based on deep learning. Opt. express 26, 32704–32717 (2018).
41. Peurifoy, J. et al. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. advances 4, eaar4206 (2018).
42. Liu, Z., Zhu, D., Rodrigues, S. P., Lee, K.-T. & Cai, W. Generative model for the inverse design of metasurfaces. Nano letters 18, 6570–6576 (2018).
43. Qu, Y., Jing, L., Shen, Y., Qiu, M. & Soljacic, M. Migrating knowledge between physical scenarios based on artificial neural networks. arXiv preprint arXiv:1809.00972 (2018).
44. Pearson, K. Liii. on lines and planes of closest fit to systems of points in space. The London, Edinburgh, Dublin Philos. Mag. J. Sci. 2, 559–572 (1901).
45. Schölkopf, B., Smola, A. & Müller, K.-R. Kernel principal component analysis. In International conference on artificial neural networks, 583–588 (Springer, 1997).
46. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. science 313, 504–507 (2006).
47. Friedman, J., Hastie, T. & Tibshirani, R. The elements of statistical learning, vol.
1 (Springer series in statistics New York, 2001).
48. Kiarashinejad, Y., Abdollahramezani, S. & Adibi, A. Deep learning approach based on dimensionality reduction for designing electromagnetic nanostructures. arXiv preprint arXiv:1902.03865 (2019).
49. Shportko, K. et al. Resonant bonding in crystalline phase-change materials. Nat. materials 7, 653 (2008).
50. Abdollahramezani, S. et al. Reconfigurable multifunctional metasurfaces employing hybrid phase-change plasmonic architecture. arXiv preprint arXiv:1809.08907 (2018).

Figure S2. Comparison of the reflection spectra of the original and reconstructed data after reducing the dimensionality of the response space employing different methods (PCA, KPCA, and autoencoder) with d = 7. [Eight panels; each plots Reflection Amplitude (0 to 1) versus Frequency (THz) (160 to 240) for the Original Data, Autoencoder, PCA, and KPCA curves.]
Figure S3. Comparison of the reflection spectra of the original and reconstructed data after reducing the dimensionality of the response space employing different methods (PCA, KPCA, and autoencoder) with d = 2. [Eight panels; each plots Reflection Amplitude (0 to 1) versus Frequency (THz) (160 to 240) for the Original Data, Autoencoder, PCA, and KPCA curves.]

Statistics arXiv (Cornell University)

Deep Learning Reveals Underlying Physics of Light-matter Interactions in Nanophotonic Devices

ISSN
2513-0390
eISSN
ARCH-3347
DOI
10.1002/adts.201900088

Abstract

Yashar Kiarashinejad$^{1,a}$, Sajjad Abdollahramezani$^{1,a}$, Mohammadreza Zandehshahvar$^{1}$, Omid Hemmatyar$^{1}$, and Ali Adibi$^{1,b}$

$^{1}$ School of Electrical and Computer Engineering, Georgia Institute of Technology, 778 Atlantic Drive NW, Atlanta, GA 30332, USA
$^{a}$ These authors contributed equally to this work
$^{b}$ ali.adibi@ece.gatech.edu

ABSTRACT

In this paper, we present a deep learning-based (DL-based) algorithm, as a purely mathematical platform, for providing intuitive understanding of the properties of electromagnetic (EM) wave-matter interaction in nanostructures. This approach is based on using the dimensionality reduction (DR) technique to significantly reduce the dimensionality of a generic EM wave-matter interaction problem without imposing significant error. Such an approach implicitly provides useful information about the role of different features (or design parameters such as geometry) of the nanostructure in its response functionality. To demonstrate the practical capabilities of this DL-based technique, we apply it to a reconfigurable optical metadevice enabling dual-band and triple-band optical absorption in the telecommunication window. Combining the proposed approach with existing commercial full-wave simulation tools offers a powerful toolkit to extract basic mechanisms of wave-matter interaction in complex EM devices and to facilitate the design and optimization of nanostructures for a large range of applications including imaging, spectroscopy, and signal processing. It is worth mentioning that the demonstrated approach is general and can be used in a large range of problems as long as enough training data can be provided.
Keywords: deep learning, physical understanding, dimensionality reduction, nanophotonics, metamaterials, plasmonics

1 Introduction

To manipulate the inherent properties (e.g., amplitude, phase, polarization, and frequency) of electromagnetic (EM) waves in the subwavelength regime, nanophotonic structures (especially metamaterials and metasurfaces) have emerged as a promising candidate [1–4]. The development of reliable fabrication techniques to realize such nanostructures has opened up new opportunities for forming reliable flat optical components to replace the existing bulky optical elements. Numerous interesting functionalities have been demonstrated so far, including planar lenses [5], calculus metasurfaces [6–12], meta-holograms [13], and nonlinear meta-modulators [14, 15]. However, systematic realization of mature optical functionalities using complex nanostructures requires significant knowledge about the influence of nanostructure features on the interaction of EM waves, which currently can only be found using cumbersome numerical calculations. Despite extensive efforts in forming new approaches for design and optimization of nanostructures (e.g., using brute-force techniques [9], evolutionary techniques [16] like genetic algorithms [17, 18] or particle swarm optimization [19], semi-analytical modeling [20–27], pattern recognition methods [28], and even neural network-based techniques [29–40]), systematic approaches for understanding the physics of wave-matter interaction in nanostructures and/or the effect of structural properties on their output response are still missing. On the other hand, available design and optimization approaches (e.g., the brute-force techniques) either suffer from significant computation complexity or over-simplify the problem in both response and design domains (e.g., due to severe down-sampling).
Such computation complexity and oversimplification hinder the use of the existing design tools to provide a detailed understanding of the dynamics of light-matter interaction inside nanostructures unless a large set of simulations is performed. Here, we present a new design and optimization approach based on deep learning (DL) that provides detailed information about the role of design parameters in the output response of any nanostructure, as well as intuitive understanding of the physics of wave-matter interaction in these nanostructures, without imposing stringent computation complexity. Our approach is based on reducing the dimensionality of the problem in both design and response spaces while preserving the vital information. By training proper neural networks (NNs) to implement these dimensionality reduction (DR) processes, we find a complex analytic formula that relates the design parameters to the output response of the nanostructure. Despite their inherent complexity, such analytic relationships provide valuable intuitive information about the role of each design parameter in the overall response of the nanostructure at minimal computation cost. This is in contrast to existing design and optimization techniques, which require costly iterative computations for each design problem without providing intuitive understanding about the role of design parameters [41–43]. This approach also allows for trading off the computation accuracy and complexity. Thus, it can be used for obtaining quick high-level information about the physics of light-matter interaction (e.g., the role of a given design parameter in the overall device performance) or for running longer simulations to achieve more detailed information about a specific feature of the structure. By providing the required information for intuitive understanding of the light-matter interaction or design of a class of patterned nanostructures for any desired application with orders of magnitude less computation complexity, this approach can have a transformative impact on several applications that rely on nanostructures, including imaging, spectroscopy, signal analysis, sensing, and LiDAR, among others. The design and optimization approach and its important properties are discussed in Section 2. The results of applying this technique to practical nanostructures are presented in Section 3. The underlying physics of the investigated nanostructures and more detailed properties of this technique are discussed in Sections 4 and 5, and final conclusions are made in Section 6.

arXiv:1905.06889v1 [physics.optics] 7 May 2019

Figure 1. Procedure of revealing the underlying physics of a generic nanostructure using the DR algorithm. First, a set of random values is fed into the full-wave EM solver as design parameters. The output responses (e.g., reflection spectra) of the corresponding simulated structures are fed to a DR algorithm, forming the reduced response space. In the next step, the reduced response-space data points and the associated random design parameters are used for training a pseudo-encoder for DR of the design space. Finally, the trained pseudo-encoder provides physical intuition about the investigated EM structure and the role of each design parameter in the overall output response. [Flowchart blocks: Set of Random Design Parameters; EM Full-Wave Simulation; Set of Random Responses; Reducing Dimensionality of Response Space; Reduced Response Space; Reducing Dimensionality of Design Space; Reduced Design Parameters; Extracting Underlying Physics.]

2 Deep learning-based Approach for Design and Optimization of Nanostructures

Figure 1 shows the schematic representation of our platform for analysis, design, optimization, and understanding the physics of nanostructures.
Our main focus in this paper is to use such a simulation platform to extract the underlying physics of light-matter interaction through running a complete design and optimization process. Our approach (depicted in Fig. 1) uses the high level of correlation of light-matter interaction in the spatial and spectral domains to considerably reduce the dimensionality of the response space of the problem. Furthermore, the correlation that often exists among the effects of structural design parameters on the response space is used to reduce the dimensionality of the design space. In the first step, a full-wave EM simulation software (based on the finite element method (FEM) implemented in the COMSOL environment, unless otherwise stated) is used to provide a sufficient number of randomly generated instances (the so-called input dataset) to train our DR approach. Each instance is calculated using a given set of randomly selected design parameters (i.e., a point in the design space), and thus, it relates the design space to the response space. After training a DR network on a subset of the available training data, we reduce the dimensionality of the response space. By setting the level of acceptable error, we find the minimum accessible dimensionality of the reduced response space. These reduced features are related to the intact dimensions in the original response space through the analytic formulas provided by the dimensionality expansion methods.

Figure 2. Schematic representation of an autoencoder architecture used in the DR technique. The left half (i.e., the encoder) reduces the dimensionality (the bottleneck layer represents the reduced space), while the right half (i.e., the decoder) recovers the data from the reduced space back to the original space. $x_i$ and $\hat{x}_i$ represent inputs and outputs, respectively.
We applied principal component analysis (PCA) [44], kernel PCA (KPCA) [45], and an autoencoder [46] to reduce the dimensionality of our response space (more details are provided in the Supplementary Information). PCA is a linear DR algorithm that projects the data points onto the eigenvectors of the covariance matrix of the responses. During the projection, the first $d$ eigenvectors with the highest eigenvalues are selected, where $d$ represents the dimensionality of the reduced response space. KPCA is the nonlinear version of PCA, in which a kernel function maps the data points using a nonlinear function and then projects them onto the basis vectors of the covariance matrix of the kernelized space. As another DR method, the autoencoder (shown in Fig. 2) encodes the high-dimensional input on the leftmost part to low-dimensional data in the middle layer using a multilayer NN. The same NN can be used to decode and recover the data (back to the original response space) with some error. In other words, an autoencoder is a feedforward NN that has the same number of inputs and outputs. The number of neurons in the middle layer sets the dimension of the low-dimensional data, i.e., the desired reduced dimensionality. This layer is also known as the bottleneck of the autoencoder. As shown in Section 3, our simulations show that the performance of the autoencoder surpasses those of PCA and KPCA. After training an autoencoder for the DR of the response space (see Fig. 3(a)), we reduce the dimensionality of the design space using a pseudo-encoder architecture (see Fig. 3(b)) to relate the design and response spaces with minimal computation complexity. Since the input and output of the DR mechanism in Fig. 3(b) are different parameters (in contrast to the case for an autoencoder), we call this architecture a pseudo-encoder.
In this manner, we directly include the information about the response of the nanostructure in the reduced design space, which is a major advantage of the pseudo-encoder architecture. Once the DR in both spaces is performed, we form a complete NN-based architecture that directly relates the design parameters to the nanostructure response by integrating the two trained NNs for the DR algorithms shown in Figs. 3(a) and 3(b). Once the underlying NNs are trained, we obtain complex analytic formulas for studying in detail the roles of the design parameters in the output features. In addition, the weights of the NNs at different layers of both the autoencoder and the pseudo-encoder can provide valuable information about the role of design parameters in the output response.

3 Analysis of Nanostructures Using the DR-based Technique

To show the applicability of the proposed approach, we consider here two simple design problems for the implementation of a reconfigurable multifunctional metadevice enabling dual-band and triple-band absorption in the telecommunication window. Figure 4 shows the schematic of the supercell structure of the metadevice, which can be electrically tuned to obtain the desired reflection spectrum when illuminated with TM-polarized light (i.e., magnetic field normal to the direction of the grating). Considering the maximum sampling value for the periodicity, the supercell in this design can consist of up to two unit cells to effectively suppress the higher diffraction orders in the telecommunication window. Each unit cell comprises a gold (Au) nanoribbon incorporating germanium antimony telluride (GST), a well-developed phase-change alloy. Upon non-volatile conversion of GST from the amorphous to the fully crystalline state, a drastic change happens in its refractive index, which consequently induces a remarkable change in the reflection response.
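Partially crystallized GST lies between these two extremes, and the Methods section models it with the Lorentz-Lorenz relation (Eq. (1)). The sketch below illustrates that mixing rule in its standard form, in which the crystalline term is weighted by $m^{\%}/100$ and the amorphous term by $1 - m^{\%}/100$. The permittivity values are placeholders chosen only for illustration (they are not taken from the paper), and the function name is hypothetical.

```python
def lorentz_lorenz(eps_a, eps_c, m_percent):
    """Effective permittivity of partially crystallized GST (Lorentz-Lorenz mixing).

    eps_a, eps_c: permittivities of amorphous and fully crystalline GST at one frequency.
    m_percent: crystallization fraction in percent (0 = amorphous, 100 = crystalline).
    """
    m = m_percent / 100.0

    def cm(eps):
        # Clausius-Mossotti-type factor appearing on both sides of the relation
        return (eps - 1.0) / (eps + 2.0)

    L = m * cm(eps_c) + (1.0 - m) * cm(eps_a)   # right-hand side of the relation
    return (1.0 + 2.0 * L) / (1.0 - L)          # solve (eps - 1)/(eps + 2) = L for eps

# Placeholder permittivities (illustrative assumptions, not data from the paper).
eps_a = 16.0 + 0.5j
eps_c = 36.0 + 10.0j

eps_mid = lorentz_lorenz(eps_a, eps_c, 50.0)
print(eps_mid)
```

At 0% and 100% the function returns the amorphous and crystalline permittivities, respectively, which is a quick sanity check on the weighting of the two terms.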
Meanwhile, the intermediate phase transition of GST can be realized by exciting it with an external stimulus (e.g., an electrical current). The nanostructure in Fig. 4 has 7 design parameters, i.e., the widths of the two Au nanoribbons in the supercell ($w_1$ and $w_2$), the unit cell periodicities ($p_1$ and $p_2$), the crystalline states of the two GST nanostripes ($m_1^{\%}$ and $m_2^{\%}$, in which the superscript number represents the crystallization fraction), and the thickness of the GST nanostripes ($h$). It is notable that a specific crystallization fraction (which is associated with the refractive index of GST) can be realized by applying a predefined gate voltage. Here, we assume that the thicknesses of the silicon dioxide (SiO$_2$) layer and the Au nanoribbons are defined by the fabrication limitations, and thus, we do not consider them as separate design parameters.

Figure 3. Reducing the dimensionality of the response and design spaces using autoencoder and pseudo-encoder platforms, respectively. (a) An autoencoder reduces the optical response features and represents the optical response in the reduced response space. (b) Architecture of a pseudo-encoder that maps the design parameters to the reduced design parameters.

The training of the DR algorithm for the response space is performed with 1700 instances obtained using the FEM simulations for structures with randomly selected design parameters. To obtain these instances, the reflection spectrum (i.e., reflection as a function of frequency) of the structure in Fig. 4 is calculated and sampled over the 150-300 THz range (with 3.75 THz spacing between adjacent samples) to obtain the response-space results. Thus, the number of parameters in the design space is 7, and the number of samples in the response space is 40.
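As a quick check of these dimensions, the sketch below builds the frequency grid and the array shapes implied by the text. It assumes the 300 THz endpoint is excluded, which gives exactly 40 samples (including it would give 41); the text does not state which convention was used, and the variable names are hypothetical.

```python
import numpy as np

# Frequency grid: 150 THz to 300 THz in 3.75 THz steps, upper endpoint excluded (assumption).
freqs_thz = np.arange(150.0, 300.0, 3.75)

n_design = 7               # w1, w2, p1, p2, m1, m2, h
n_response = len(freqs_thz)
n_train = 1700             # training instances reported in the text

# Empty containers with the shapes the training set would have (no simulated data here).
designs = np.zeros((n_train, n_design))
responses = np.zeros((n_train, n_response))

print(n_response, freqs_thz[0], freqs_thz[-1])
print(designs.shape, responses.shape)
```

Each row of `designs` would hold one randomly selected point in the design space, and the matching row of `responses` the sampled reflection spectrum returned by the full-wave solver.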
Details of these simulations are provided in Methods. In addition, we simulated 300 extra structures (with randomly selected design parameters) to obtain the validation dataset. We applied the three DR algorithms (discussed in Section 2) with different numbers of reduced dimensions to the training dataset and tested them with the (unseen) validation dataset. The mean squared error (MSE) for different numbers of dimensions ($d$) in the reduced response space for the three DR algorithms is shown in Fig. 5(a). In these simulations, a polynomial kernel of degree 7 is selected for the KPCA method. The autoencoder (see Fig. 2) consists of 7 layers in total, and the numbers of nodes in its layers are 40, 30, 20, $d$, 20, 30, and 40, respectively. Here, $d$ represents the dimension of the reduced response space, which is the number of nodes in the bottleneck of the autoencoder in Fig. 2. The activation function for all nodes is fixed to the tangent sigmoid function. As shown in Fig. 5(a), the autoencoder outperforms PCA and KPCA for all values of $d$. This reveals the effectiveness of the autoencoder in preserving the nonlinear properties of the response space. KPCA works slightly better than PCA for lower dimensions ($d$); however, it has a higher MSE as the dimensionality increases because of overfitting. Figure 6 shows the spectra reconstructed using the different DR methods for three values of $d$ ($d$ = 2, 7, and 16 for Figs. 6(a), 6(c), and 6(e), respectively, with the corresponding errors shown in Figs. 6(b), 6(d), and 6(f)). As seen from Fig. 6(c), the autoencoder is able to reconstruct the response spectrum after reducing the dimensionality of the response space from 40 to 7. Figure 5(a) also confirms that the autoencoder with $d$ = 7 is a good choice for the DR of the response space with MSE < 0.05. In the next step, we train the pseudo-encoder with the training dataset and test it with the validation dataset using the approach discussed in .
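For reference, the autoencoder layout described above (40, 30, 20, $d$, 20, 30, and 40 nodes with a tangent sigmoid activation at every node) can be written as a plain forward pass. The sketch below is illustrative only: the weights are random and untrained, the input spectra are random stand-ins rather than simulated data, and the helper names `init_layers` and `forward` are hypothetical. It shows where the $d$-dimensional bottleneck sits and how a reconstruction MSE of the kind plotted in Fig. 5(a) would be evaluated on a batch.

```python
import numpy as np

def init_layers(sizes, rng):
    """Random (untrained) weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Apply tanh (tangent sigmoid) at every layer; return all activations."""
    acts = [x]
    for W, b in layers:
        acts.append(np.tanh(acts[-1] @ W + b))
    return acts

d = 7                                    # bottleneck size chosen in the text
sizes = [40, 30, 20, d, 20, 30, 40]      # layer widths quoted in the text
rng = np.random.default_rng(0)
layers = init_layers(sizes, rng)

# Stand-in batch of 5 "spectra" with 40 samples each (not simulation data).
spectra = rng.uniform(0.0, 1.0, size=(5, 40))
acts = forward(spectra, layers)
code, recon = acts[3], acts[-1]          # bottleneck features and reconstruction

# Reconstruction MSE on this batch; meaningful only after training.
mse = np.mean((spectra - recon) ** 2)
print(code.shape, recon.shape)
```

In a trained network, the same `mse` evaluated on the unseen validation set as a function of `d` would play the role of the autoencoder curve in Fig. 5(a).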
To simplify the computation, we only consider one layer for the encoder part of the pseudo-encoder. Figure 5(b) shows the MSE as a function of the dimension $D$ of the reduced design space. It is clear that by reducing the dimension of the design space from 7 to 4, MSE < 0.02 is achieved.

Figure 4. Three-dimensional (3D) illustration of the hybrid plasmonic/phase-change material metasurface studied in this paper. The design parameters are the thickness of the GST nanostripes ($h$), the crystallization fractions of the GST nanostripes ($m_1^{\%}$ and $m_2^{\%}$), the unit cell periodicities ($p_1$ and $p_2$), and the Au nanoribbon widths ($w_1$ and $w_2$), while the thickness of the Au nanoribbon is fixed at 30 nm. Each supercell can consist of one or two unit cells, each comprising a GST nanostripe encapsulated between the Au nanoribbon and the Au back-reflector, separated from each other by symmetric SiO$_2$ spacers. $m_1^{\%}$ and $m_2^{\%}$ are dependent on the gate voltages $V_1$ and $V_2$, respectively. The whole structure is illuminated with TM-polarized light in the near-infrared frequency range.

Figure 7(a) shows the pseudo-encoder architecture with the design parameters and the reduced response space being its input and output, respectively. The weights of the first layer in the pseudo-encoder represent the importance of the different design parameters. In this manner, each input node (corresponding to one design parameter) is connected to the nodes in the second layer with the strengths given by the weights. Thus, design parameters with more significant roles have larger weights in this layer. Figure 7(b) shows the weights in the first layer of the pseudo-encoder. It is clear that the GST thickness $h$ has the strongest role in the output response. Thus, the response of the structure is more sensitive to this parameter. Besides, $w_i$ ($i = 1, 2$) and $p_i$ ($i = 1, 2$) have almost similar accumulative intensities (or weights) and thus, similar influence on the response space.
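One simple way to read such accumulative weights off a trained first layer is to sum the absolute weights attached to each input node. The sketch below illustrates this under explicit assumptions: the matrix `W1` is made up for illustration (it is not the trained weight matrix of Fig. 7(b)), its 4 x 7 shape follows the 7-to-4 reduction described above, and the comparison between parameters presumes that all inputs were scaled to comparable ranges before training.

```python
import numpy as np

params = ["h", "w1", "w2", "p1", "p2", "m1", "m2"]

# Hypothetical first-layer weights: 4 reduced dimensions x 7 design parameters.
# These numbers are illustrative only, not values from the trained network.
W1 = np.array([
    [ 0.9,  0.4, -0.3,  0.3,  0.4,  0.10, -0.05],
    [-0.8, -0.3,  0.5, -0.4,  0.2,  0.05,  0.10],
    [ 0.7,  0.5,  0.3,  0.4, -0.5, -0.10,  0.05],
    [-1.0,  0.2, -0.4,  0.3,  0.3,  0.05, -0.10],
])

# Accumulated absolute weight per input node, normalized to sum to 1.
importance = np.abs(W1).sum(axis=0)
importance = importance / importance.sum()

# Parameters sorted from most to least influential under this measure.
order = np.argsort(importance)[::-1]
ranking = [params[i] for i in order]
for name, score in zip(ranking, importance[order]):
    print(f"{name}: {score:.3f}")
```

With these illustrative numbers, `h` ranks first and the two crystallization fractions rank last, mirroring the qualitative ordering discussed in the text.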
It is clear that the response space is only slightly affected by the less important design parameters $m_i^{\%}$ ($i = 1, 2$). This understanding of the relative importance of the design parameters in the output response is very helpful for initializing any optimization process, even with traditional approaches.

4 Understanding the Physics of Light-matter Interaction in Nanostructures

Herein, we illustrate that the interpretation of the weights of the pseudo-encoder can effectively reveal the underlying physics of light-matter interactions in nanostructures. For this purpose, we perform a comprehensive analysis of the fundamental modes of the metasurface in Fig. 4 using full-wave EM simulations in the given frequency range. Further information on the material properties as well as details of the FEM simulation process are provided in the Methods. Figure 8(a) shows that, upon excitation of a unit cell of the structure in Fig. 4 with TM-polarized light, the Au nanoribbon/GST nanostripe and GST nanostripe/Au back-reflector interfaces can support short-range surface plasmons (SR-SPPs), which are EM waves that are bound to and travel along the metal-dielectric interface with short propagation lengths. It should be noted that the effective index contrast at the interfaces between the Au nanoribbon end-faces and air implies that each nanoribbon approximately acts as a lateral mirror-like cavity. Accordingly, constructive interference occurs between the SR-SPP modes travelling back and forth (i.e., in the x-direction in Fig. 4) between the two ends of the Au nanoribbon. Furthermore, the difference between the refractive indices of GST and SiO$_2$ enhances the effect of the lateral Fabry-Perot cavity in the intermediate GST nanostripe at the interface between the GST and the (lower) Au back-reflector plane. Thus, a similar mode profile exists within that region, which can be ascribed to a confined constructive SPP.
Figure 5. (a) MSE versus dimensionality for the different DR algorithms (autoencoder, PCA, and kernel PCA) on the response space. The autoencoder has 7 layers with dimensions 40, 30, 20, $d$, 20, 30, and 40, where $d$ represents the number of reduced dimensions. The KPCA is trained with a polynomial kernel of degree 7. (b) MSE of the pseudo-encoder for different dimensions of the reduced design space. The numbers of nodes (or the dimensions) of the different layers of the pseudo-encoder are 7, $d$, 10, 20, 20, 30, 30, 40, and 7, where $d$ represents the dimensionality of the reduced design space.

The two aforementioned SPPs are the fundamental modes of the metasurface defining any arbitrary response from the structure. We expect that the parameter simultaneously modifying the field profiles of these modes plays the key role in engineering the spectral response of the metasurface. The DR algorithm introduces $h$ as the most influential parameter (see Fig. 7(b)). To justify this claim, we study the effect of three different $h$ values on the field profile of the metasurface for a fixed set of other parameters. Figure 8(b) shows that when the two Au-GST interfaces (at the top and bottom of the GST nanostripe) are far apart (i.e., large $h$), the fundamental modes are spatially separated in a unit cell. By decreasing the distance between these interfaces (i.e., $h$), the coupling between the SR-SPP and the confined SPP modes sustained by the individual interfaces increases until these highly coupled modes form a supermode as the dominant mode of the structure (see Fig. 8(a)). Figure 8(c) shows that a further decrease in $h$ results in the fading of one of the modes. To further verify our conclusion, we finely sweep $h$ and present the reflection spectrum in Fig. 10(a). This figure illustrates that by changing $h$, no abrupt change happens in the reflection spectrum profile.
Such a gradual transition verifies the absence of the well-known gap-surface-plasmon resonance, a highly confined magnetic mode that originates from the circulating displacement current between a metal nanoribbon and a metal back-reflector. This can be ascribed to the high refractive index of GST at any crystallization fraction. Thus, we conclude that the metasurface only supports the two above-mentioned SPP modes (and no gap-surface-plasmon mode). This makes $h$ the most important design parameter with the maximum influence on the variation of the reflection spectrum.

Figure 7(b) shows that $w_i$ ($i = 1, 2$) and $p_i$ ($i = 1, 2$) have the second most dominant effects on the EM response after $h$. This is in line with the well-known fact that the resonance frequency of an SPP mode is highly dependent on the width of the nanocavity. Moreover, we found that even by continuously increasing $w$, only the well-known odd-order SPP modes could be excited (see Fig. 9). This observation indicates that variation of $w$ does not change the inherent nature of these individual SPP modes and keeps them rather decoupled. Figure 10(b) corroborates that, while the other parameters in a unit cell are fixed, decreasing (increasing) the width ($w$) blueshifts (redshifts) the resonance. However, the reflection response is not as sensitive to $w$ as to $h$ since the nature of both individual SPP modes remains intact while $w$ is varied. More specifically, when modifying the width around $w$ = 350 nm while keeping the other parameters fixed (i.e., $h$ = 170 nm, $p$ = 580 nm, and $m^{0\%}$), the resonance profile in Fig. 10(b) has a broader linewidth compared to its counterpart in Fig. 10(a). This comparison supports the more sensitive nature of the reflection spectrum to $h$.

On the role of $p$, it is notable that light diffraction from the surface of the structure is the origin of the confined SPP mode excited at the interface of the GST nanostripe/Au back-reflector.
As a result, the behavior of the overall reflection spectrum relies on the periodicity (i.e., $p$) of each unit cell. Figure 10(c) illustrates that by increasing $p$, a smaller portion of the incident light couples to the confined SPP mode, and some part of it is reflected in the form of higher diffraction orders. On the other hand, decreasing the periodicity can increase the coupling between adjacent unit cells, which changes the overall reflection response.

Figure 6. Reconstructed reflection amplitude (left) and reconstruction MSE versus frequency (right) for dimensionality reduction using PCA, KPCA, and the autoencoder. Results for reducing the dimensionality from 40 to (a),(b) 2, (c),(d) 7, and (e),(f) 16. The hyper-parameters for KPCA and the autoencoder are the same as the parameters used in Fig. 5. (Axes: reflection amplitude and MSE versus frequency in THz; curves: original data, autoencoder, PCA, and KPCA.)

Figure 7. Detailed architecture of the adopted pseudo-encoder. (a) The numbers of nodes of the pseudo-encoder are 7, 4, 10, 20, 20, 30, 30, 40, and 7 at each layer. (b) Weights of the first layer of the pseudo-encoder in (a), which is highlighted in yellow.

Finally, Fig. 7(b) shows that $m_i^{\%}$ ($i = 1, 2$) has the minimum effect on the EM response among the investigated parameters.
This is also seen from Fig. 10(d), as slightly changing $m^{\%}$ (while keeping all other parameters fixed) results in no major change in the reflection spectrum amplitude. Nevertheless, Fig. 10(d) suggests that the location of the minimum of the reflection spectrum (or the absorption peak) depends on $m^{\%}$; increasing (decreasing) $m^{\%}$ (i.e., a larger (smaller) GST refractive index) results in a red (blue) shift in the absorption peak. Accordingly, choosing a unit cell with a proper crystallization fraction ($m^{\%}$) ensures near-unity absorption at the desired frequency. More importantly, a structure with a supercell with two different values of $m^{\%}$ ($m_1^{\%}$ and $m_2^{\%}$ in Fig. 4) can have a multi-band absorption governed by the constructive and/or destructive overlap between the distinct resonance peaks corresponding to the two values of $m^{\%}$. This design approach can be extended to structures with more sophisticated supercells to form more complex absorption spectra.

Figure 8. EM field distributions in a unit cell of the structure in Fig. 4. Magnetic field (represented by the thermal colormap) and electric field profile (represented by arrows and coded by the rainbow colorbar) for a unit cell with (a) $h$ = 150 nm, (b) $h$ = 250 nm, and (c) $h$ = 50 nm, respectively. The other structural parameters are fixed as $p$ = 550 nm, $w$ = 340 nm, and $m^{60\%}$. The frequency of the incident TM-polarized light is $f$ = 194 THz.

5 Discussion

The intuitive understanding of the role of the design parameters in the response of a given nanostructure obtained in Section 4 can facilitate the design of more sophisticated nanostructures using any optimization technique. The computation complexity of any optimization approach depends heavily on the discretization of the values of the different design parameters. Our approach suggests that the finest discretization should be used for the most influential design parameter (e.g., $h$ in the structure in Fig.
4) while a sparser discretization is acceptable for less important design parameters (e.g., $m_i^{\%}$ ($i = 1, 2$) in the structure in Fig. 4).

Figure 9. EM field distributions in a unit cell of the structure in Fig. 4. Magnetic field (represented by the thermal colormap) and electric field profile (represented by arrows and coded by the rainbow colorbar) for a unit cell with (a) $w$ = 100 nm (first-order mode), (b) $w$ = 300 nm (third-order mode), and (c) $w$ = 500 nm (fifth-order mode), respectively. The other structural parameters are fixed as $p$ = 550 nm, $h$ = 150 nm, and $m^{60\%}$. The frequency of the incident TM-polarized light is $f$ = 194 THz.

To test this approach, we used our findings in Section 4 about the structure in Fig. 4 to design a multifunctional metadevice providing dual-band absorption (at two distinct frequencies of $f_1$ = 195 THz and $f_2$ = 235 THz) and triple-band absorption (at three distinct frequencies of $f_1$ = 170 THz, $f_2$ = 205 THz, and $f_3$ = 240 THz). For device optimization, we used an exhaustive search of the design parameters using the analytic formulas obtained by the trained system that relates the design space to the response space with DR. We also used a non-uniform discretization of the design parameters, based on our findings in Section 4, to minimize the computation complexity. The optimized supercell offered by this approach has two unit cells with similar structural parameters $h$ = 180 nm, $p_i$ = 550 nm ($i = 1, 2$), and $w_i$ = 340 nm ($i = 1, 2$). ($m_1^{0\%}$, $m_2^{50\%}$) and ($m_1^{0\%}$, $m_2^{0\%}$) are obtained for the dual- and the triple-band absorption functionalities, respectively. Figure 11 shows the response of the designed multifunctional metadevice. As shown, by slightly changing $m_1^{\%}$ and $m_2^{\%}$, the locations of the absorption peaks do not considerably change around the optimized values.
This is anticipated from the discussion in Section 4; the overall reflection response of the structure has minimum sensitivity to $m_1^{\%}$ and $m_2^{\%}$ and thus, it is robust against random variations of the external stimulus (here the gate voltage) or other destructive environmental effects (such as GST oxidization).

Another important observation is that by retraining the algorithm using a different set of training data (for the same structure), the weights slightly change, but the trends of their variations remain the same. This means that the intuitive understanding of the roles of the design parameters in the response is to a good degree independent of the training process. We repeated this process at least 20 times with different sets of training data for the structure in Fig. 4 and reached the same conclusions in all trials. Nevertheless, we think that for sophisticated nanostructures with many design parameters, training the algorithm with different training sets may reveal different information about the device operation. This is especially valuable for complex nanostructures in which simple simulations (e.g., the ones that resulted in Fig. 11) cannot be used to understand the role of the design parameters. It is also not practical to simulate enough structures to learn the role of a specific design parameter due to the computation complexity of such complex nanostructures. It is also worth mentioning that the technique discussed here is not limited to nanostructures; it can be extended to cover many different problems (e.g., fluid mechanics, heat transfer, acoustic wave propagation, etc.) as long as enough training data can be provided.

6 Conclusion

We demonstrated here a DL-based technique for the understanding of the physics of wave-matter interaction in nanostructures.
By using the DR algorithm in the response space and the design space (using an autoencoder and a pseudo-encoder, respectively), we could obtain an analytic formula that relates the design parameters to the response of the nanostructure while providing access to the weights of the neural networks at all layers. By analyzing these weights, important information about the roles of the different design parameters in the overall response of the nanostructure can be obtained. This intuitive information can be used to understand the physics of light-matter interaction while facilitating the device optimization process by suggesting a non-uniform discretization of the design parameters to reduce the computation requirements. As such, the approach presented here can have an important impact on the design and understanding of the EM wave-matter interaction in nanostructures while being extendable to several other applications.

Figure 10. Reflection amplitude profile of a unit cell of the metasurface shown in Fig. 4 with different structural parameters under illumination with TM-polarized light. Reflection amplitude versus (a) GST nanostripe thickness while the other parameters are chosen as $p$ = 580 nm, $w$ = 350 nm, and $m^{0\%}$, (b) Au nanoribbon width while the other parameters are fixed as $h$ = 170 nm, $p$ = 580 nm, and $m^{0\%}$, (c) periodicity of the unit cell while the other parameters are chosen as $h$ = 170 nm, $w$ = 350 nm, and $m^{0\%}$, and (d) crystallization fraction while the geometrical parameters are fixed as $h$ = 170 nm, $p$ = 580 nm, and $w$ = 350 nm.

Figure 11. Absorption spectra of the designed optimized multifunctional metadevice. (a) Spectra of the original dual-band absorber (solid blue line) at two distinct frequencies of $f_1$ = 195 THz and $f_2$ = 235 THz, and (b) triple-band absorber (solid blue line) at three distinct frequencies of $f_1$ = 170 THz, $f_2$ = 205 THz, and $f_3$ = 240 THz.
Red dashed curves show that by slightly modifying $m_i^{\%}$ ($i = 1, 2$) around the optimized values, the spectra change moderately.

Methods

All full-wave EM simulation results shown in the main text were obtained by using COMSOL Multiphysics 5.3, a commercialized full-wave simulation package based on the FEM. The proposed metasurface in Fig. 4 was simulated in a two-dimensional environment with periodic boundary conditions in the y direction. The structure is assumed infinite in the x direction and was excited with a TM-polarized plane wave propagating in the +z direction. The optical properties of the amorphous and the fully crystalline GST were obtained from and those of the intermediate states were calculated using the well-known Lorentz-Lorenz relation formulated as:

$$\frac{\varepsilon_{\mathrm{eff}}(f) - 1}{\varepsilon_{\mathrm{eff}}(f) + 2} = \frac{m^{\%}}{100}\,\frac{\varepsilon_{c}(f) - 1}{\varepsilon_{c}(f) + 2} + \left(1 - \frac{m^{\%}}{100}\right)\frac{\varepsilon_{a}(f) - 1}{\varepsilon_{a}(f) + 2}, \tag{1}$$

where, for a specific frequency $f$, $\varepsilon_{\mathrm{eff}}(f)$ is the effective permittivity of the intermediate state, $\varepsilon_{c}(f)$ and $\varepsilon_{a}(f)$ are the permittivities of the fully crystalline and the amorphous GST, respectively, and $m^{\%}$, ranging from 0% (i.e., $m^{0\%}$ or amorphous) to 100% (i.e., $m^{100\%}$, fully crystalline), is the crystallization fraction of GST. The optical properties of other materials were obtained from .

Supplementary Information

S1 Principal Component Analysis (PCA)

PCA is a linear dimensionality reduction method, which maps the data into a linear subspace so that the variance is maximized. In other words, PCA projects the data points onto the eigenvectors of the covariance matrix of the data points with the largest eigenvalues. Among linear projections, PCA results in the minimum MSE during reconstruction. Thus, it gives the best linear representation of the data in the lower-dimensional space in terms of MSE.
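The PCA procedure outlined above (centering, singular value decomposition, projection onto the leading directions, and reconstruction) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used for the results in this paper: it assumes that each column of `X` holds one data point (one sampled spectrum), it uses synthetic low-rank data in place of simulated spectra, and the function name `pca_reconstruct` is hypothetical.

```python
import numpy as np

def pca_reconstruct(X, d):
    """Project the columns of X onto the top-d principal directions and reconstruct."""
    x_mean = X.mean(axis=1, keepdims=True)        # mean data point
    X_hat = X - x_mean                            # centered data
    U, s, Vt = np.linalg.svd(X_hat, full_matrices=False)
    U_d = U[:, :d]                                # first d left singular vectors
    PX = U_d.T @ X_hat                            # d x N projected data
    RX = U_d @ PX + x_mean                        # reconstruction in the original space
    # Mean over data points of the squared reconstruction error.
    mse = np.mean(np.sum((X - RX) ** 2, axis=0))
    return PX, RX, mse

rng = np.random.default_rng(0)
# Synthetic stand-in: 40 samples per data point, 200 data points,
# rank-3 structure plus a small amount of noise (not simulation data).
basis = rng.normal(size=(40, 3))
coeffs = rng.normal(size=(3, 200))
X = basis @ coeffs + 0.01 * rng.normal(size=(40, 200))

for d in (1, 3, 7):
    PX, RX, mse = pca_reconstruct(X, d)
    print(d, PX.shape, mse)
```

Because the retained subspaces are nested, the reconstruction error cannot increase as `d` grows, which is the behavior expected of the PCA curve in an MSE-versus-dimensionality plot.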
Considering $X$ as the response-space matrix whose columns are the sampled reflection amplitudes for the individual designs, the first step in this algorithm is to center the dataset (i.e., subtract the mean from the data points):

$$\hat{X} = X - X_{\mathrm{mean}}, \tag{S1}$$

where $X_{\mathrm{mean}}$ is the mean matrix of the data points, and $\hat{X}$ is the centered matrix. The eigenvectors of the covariance matrix of $\hat{X}$ represent the principal components. These vectors are found using the singular value decomposition as:

$$\hat{X} = U \Sigma V^{T}. \tag{S2}$$

Here, the columns of the matrix $U$ represent the basis vectors of the covariance matrix $\hat{X}\hat{X}^{T}$. In addition, $\Sigma$ is a diagonal matrix, and its diagonal elements ($\sigma_i$) are the singular values associated with the columns of $U$. Therefore, by keeping the first $d$ columns of $U$ with the largest singular values, the projected matrix is:

$$PX = U_d^{T}\hat{X}, \tag{S3}$$

where $U_d$ contains the first $d$ columns of $U$ and $PX$ is the projection of the data in the lower-dimensional space. To reconstruct the projected response, we first invert the projection and then add the mean to the matrix as:

$$RX = U_d\,PX + X_{\mathrm{mean}}. \tag{S4}$$

Finally, the reconstruction error is

$$\sum_{i=1}^{N} \lVert X_i - RX_i \rVert^{2}, \tag{S5}$$

where $N$ is the number of data points, and $X_i$ and $RX_i$ represent the $i$-th data points (columns) of the response-space matrix and of its reconstruction, respectively.

S2 Kernel PCA (KPCA)

The nonlinear version of PCA is known as kernel PCA (KPCA). Figure S1 shows how PCA and KPCA project data points into the lower-dimensional space. As discussed before, PCA projects the data points onto a linear subspace. However, if there are nonlinear properties in the dataset, PCA might result in poor performance. KPCA transforms the original data into a higher-dimensional space using a nonlinear mapping $\varphi(x_i)$ for all data points and then projects the transformed data into the lower-dimensional space. KPCA provides better results when we are interested in nonlinear relations in the dataset. The kernel function is defined as $k(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$.
Two well-known kernels are the polynomial kernel $k(x_i, x_j) = (x_i^{T} x_j + c)^{m}$ and the Gaussian kernel $k(x_i, x_j) = e^{-\lVert x_i - x_j \rVert^{2}/2\gamma}$, where $m$ and $\gamma$ are the free parameters. The best parameters can then be found using the cross-validation technique. In this work, we used the polynomial kernel to reduce the dimension of the data and compared the results with the other methods. To implement the kernel PCA, we should do the following:

1. Compute the Gram matrix $K$, where $K_{ij} = k(x_i, x_j)$. Note that the dataset should be centered (zero mean) in the transformed space. The centered Gram matrix is as below:

$$\tilde{K} = K - 1_N K - K 1_N + 1_N K 1_N, \tag{S6}$$

where $1_N$ is an $N \times N$ matrix with all elements equal to $1/N$.
2. Find the basis vectors of the transformed space by using the eigen-decomposition of the Gram matrix.
3. Project the data points onto the first $d$ eigenvectors with the largest eigenvalues.

Figure S1. Linear PCA projects the data points onto the direction of the largest principal component. Kernel PCA, however, maps the data points into another space using a nonlinear kernel function and then projects them onto the principal components of the new space.

S3 Autoencoder

An autoencoder is a neural-network-based dimensionality reduction architecture. Figure 2 shows a schematic of an autoencoder with a multilayered NN that maps the high-dimensional input on the leftmost layer to the low-dimensional data in the middle layer. The same NN can be used to decode and recover the data back to the original space with a specific error. In effect, the data from the input layer are first compressed and subsequently decompressed into an output that closely matches the original data. For the simplest case (i.e.,
mono-layer encoder and mono-layer decoder), the relations between the high-dimensional and low-dimensional data are:

$$z = \sigma(W x + b), \tag{S7}$$

$$x' = \sigma'(W' z + b'), \tag{S8}$$

where $x$, $z$, $W$, $W'$, $b$, $b'$, $\sigma$, and $\sigma'$ represent the high-dimensional data, low-dimensional data, encoder weight matrix, decoder weight matrix, encoder bias, decoder bias, encoder activation function, and decoder activation function, respectively, and $x'$ is the reconstructed data. To find the optimum weights for the autoencoder, the MSE loss function should be minimized:

$$L = \lVert x - x' \rVert^{2} = \lVert x - \sigma'(W'\sigma(W x + b) + b') \rVert^{2}. \tag{S9}$$

S4 Comparison of PCA, KPCA, and Autoencoder

Figures S2 and S3 demonstrate the performance of PCA, KPCA, and the autoencoder in reconstructing the reflection spectra from the reduced space. The results show the effectiveness of the dimensionality reduction in recovering the reflection spectra after finding the optimum low-dimensional space.

References

1. Jahani, S. & Jacob, Z. All-dielectric metamaterials. Nat. nanotechnology 11, 23 (2016).
2. Yu, N. et al. Light propagation with phase discontinuities: generalized laws of reflection and refraction. science 1210713 (2011).
3. Arbabi, A., Horie, Y., Bagheri, M. & Faraon, A. Dielectric metasurfaces for complete control of phase and polarization with subwavelength spatial resolution and high transmission. Nat. nanotechnology 10, 937 (2015).
4. Hsiao, H.-H., Chu, C. H. & Tsai, D. P. Fundamentals and applications of metasurfaces. Small Methods 1, 1600064 (2017).
5. Khorasaninejad, M. et al. Metalenses at visible wavelengths: Diffraction-limited focusing and subwavelength resolution imaging. Science 352, 1190–1194 (2016).
6. AbdollahRamezani, S., Arik, K., Khavasi, A. & Kavehvash, Z. Analog computing using graphene-based metalines. Opt. letters 40, 5239–5242 (2015).
7. Chizari, A., Abdollahramezani, S., Jamali, M. V. & Salehi, J. A. Analog optical computing based on a dielectric meta-reflect array. Opt. letters 41, 3451–3454 (2016).
8. Abdollahramezani, S., Chizari, A., Dorche, A.
E., Jamali, M. V. & Salehi, J. A. Dielectric metasurfaces solve differential and integro-differential equations. Opt. letters 42, 1197–1200 (2017).
9. Campbell, S. D. et al. Review of numerical optimization techniques for meta-device design. Opt. Mater. Express 9, 1842–1863 (2019).
10. Sakurai, A. et al. Ultranarrow-band wavelength-selective thermal emission with aperiodic multilayered metamaterials designed by bayesian optimization. ACS central science 5, 319–326 (2019).
11. Pestourie, R. et al. Inverse design of large-area metasurfaces. Opt. express 26, 33732–33747 (2018).
12. Ma, W., Cheng, F. & Liu, Y. Deep-learning-enabled on-demand design of chiral metamaterials. ACS nano 12, 6326–6334 (2018).
13. Chen, W. T. et al. High-efficiency broadband meta-hologram with polarization-controlled dual images. Nano letters 14, 225–230 (2013).
14. Taghinejad, M. et al. Ultrafast control of phase and polarization of light expedited by hot-electron transfer. Nano letters 18, 5544–5551 (2018).
15. Taghinejad, M. et al. Hot-electron-assisted femtosecond all-optical modulation in plasmonics. Adv. Mater. 30, 1704915 (2018).
16. Seidel, S. Y. & Rappaport, T. S. Site-specific propagation prediction for wireless in-building personal communication system design. IEEE transactions on Veh. Technol. 43, 879–891 (1994).
17. Gondarenko, A. & Lipson, M. Low modal volume dipole-like dielectric slab resonator. Opt. express 16, 17689–17694 (2008).
18. Håkansson, A. & Sánchez-Dehesa, J. Inverse designed photonic crystal de-multiplex waveguide coupler. Opt. Express 13, 5440–5449 (2005).
19. Ong, J. R., Chu, H. S., Chen, V. H., Zhu, A. Y. & Genevet, P. Freestanding dielectric nanohole array metasurface for mid-infrared wavelength applications. Opt. letters 42, 2639–2642 (2017).
20. Piggott, A. Y., Petykiewicz, J., Su, L. & Vučković, J. Fabrication-constrained nanophotonic inverse design. Sci. Reports 7, 1786 (2017).
21. Lu, J. & Vučković, J. Nanophotonic computational design. Opt.
express 21, 13351–13367 (2013).
22. Su, L., Piggott, A. Y., Sapra, N. V., Petykiewicz, J. & Vuckovic, J. Inverse design and demonstration of a compact on-chip narrowband three-channel wavelength demultiplexer. ACS Photonics 5, 301–305 (2017).
23. Frellsen, L. F., Ding, Y., Sigmund, O. & Frandsen, L. H. Topology optimized mode multiplexing in silicon-on-insulator photonic wire waveguides. Opt. express 24, 16866–16873 (2016).
24. Piggott, A. Y. et al. Inverse design and implementation of a wavelength demultiplexing grating coupler. Sci. reports 4, 7210 (2014).
25. Englund, D., Fushman, I. & Vuckovic, J. General recipe for designing photonic crystal cavities. Opt. express 13, 5961–5975 (2005).
26. Molesky, S. et al. Inverse design in nanophotonics. Nat. Photonics 12, 659 (2018).
27. Mansouree, M. & Arbabi, A. Large-scale metasurface design using the adjoint sensitivity technique. In Conference on Lasers and Electro-Optics, FF1F.7, DOI: 10.1364/CLEO_QELS.2018.FF1F.7 (Optical Society of America, 2018).
28. Melati, D. et al. Mapping the global design space of integrated photonic components using machine learning pattern recognition. arXiv preprint arXiv:1811.01048 (2018).
29. Ma, W., Cheng, F. & Liu, Y. Deep-learning enabled on-demand design of chiral metamaterials. ACS nano (2018).
30. Baxter, J. et al. Plasmonic colours predicted by deep learning. arXiv preprint arXiv:1902.05898 (2019).
31. Liu, D., Tan, Y., Khoram, E. & Yu, Z. Training deep neural networks for the inverse design of nanophotonic structures. ACS Photonics 5, 1365–1369 (2018).
32. Peurifoy, J. et al. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. advances 4, eaar4206 (2018).
33. Liu, Z., Zhu, D., Rodrigues, S., Lee, K.-T. & Cai, W. A generative model for the inverse design of metasurfaces. Nano letters (2018).
34. Tahersima, M. H. et al. Deep neural network inverse design of integrated nanophotonic devices. arXiv preprint arXiv:1809.03555 (2018).
35.
Zhang, T. et al. Spectrum prediction and inverse design for plasmonic waveguide system based on artificial neural networks. arXiv preprint arXiv:1805.06410 (2018).
36. Qu, Y., Jing, L., Shen, Y., Qiu, M. & Soljacic, M. Migrating knowledge between physical scenarios based on artificial neural networks. arXiv preprint arXiv:1809.00972 (2018).
37. Inampudi, S. & Mosallaei, H. Neural network based design of metagratings. Appl. Phys. Lett. 112, 241102 (2018).
38. Yao, K., Unni, R. & Zheng, Y. Intelligent nanophotonics: merging photonics and artificial intelligence at the nanoscale. arXiv preprint arXiv:1810.11709 (2018).
39. So, S., Mun, J. & Rho, J. Simultaneous inverse design of materials and parameters of core-shell nanoparticle via deep-learning: Demonstration of dipole resonance engineering. arXiv preprint arXiv:1904.02848 (2019).
40. Asano, T. & Noda, S. Optimization of photonic crystal nanocavities based on deep learning. Opt. express 26, 32704–32717 (2018).
41. Peurifoy, J. et al. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. advances 4, eaar4206 (2018).
42. Liu, Z., Zhu, D., Rodrigues, S. P., Lee, K.-T. & Cai, W. Generative model for the inverse design of metasurfaces. Nano letters 18, 6570–6576 (2018).
43. Qu, Y., Jing, L., Shen, Y., Qiu, M. & Soljacic, M. Migrating knowledge between physical scenarios based on artificial neural networks. arXiv preprint arXiv:1809.00972 (2018).
44. Pearson, K. Liii. on lines and planes of closest fit to systems of points in space. The London, Edinburgh, Dublin Philos. Mag. J. Sci. 2, 559–572 (1901).
45. Schölkopf, B., Smola, A. & Müller, K.-R. Kernel principal component analysis. In International conference on artificial neural networks, 583–588 (Springer, 1997).
46. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. science 313, 504–507 (2006).
47. Friedman, J., Hastie, T. & Tibshirani, R. The elements of statistical learning, vol.
1 (Springer series in statistics New York, 2001). 48. Kiarashinejad, Y., Abdollahramezani, S. & Adibi, A. Deep learning approach based on dimensionality reduction for designing electromagnetic nanostructures. arXiv preprint arXiv:1902.03865 (2019). 49. Shportko, K. et al. Resonant bonding in crystalline phase-change materials. Nat. materials 7, 653 (2008). 50. Abdollahramezani, S. et al. Reconfigurable multifunctional metasurfaces employing hybrid phase-change plasmonic architecture. arXiv preprint arXiv:1809.08907 (2018). 15/17 Original Data Original Data 1 1 Autoencoder Autoencoder PCA PCA KPCA KPCA 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 160 170 180 190 200 210 220 230 240 160 170 180 190 200 210 220 230 240 Frequency (THz) Frequency(THz) Original Data Original Data 1 1 Autoencoder Autoencoder PCA PCA KPCA KPCA 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 160 170 180 190 200 210 220 230 240 160 170 180 190 200 210 220 230 240 Frequency (THz) Frequency (THz) Original Data Original Data 1 1 Autoencoder Autoencoder PCA PCA KPCA KPCA 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 160 170 180 190 200 210 220 230 240 160 170 180 190 200 210 220 230 240 Frequency (THz) Frequency (THz) Original Data Original Data 1 1 Autoencoder Autoencoder PCA PCA KPCA KPCA 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 160 170 180 190 200 210 220 230 240 160 170 180 190 200 210 220 230 240 Frequency (THz) Frequency (THz) Figure S2. Comparison of the reflection spectra of the original and reconstructed data after reducing the dimensionality of the response space employing different methods (PCA, KPCA, and autoencoder) with d = 7. 
Figure S3. Comparison of the reflection spectra of the original and reconstructed data after reducing the dimensionality of the response space employing different methods (PCA, KPCA, and autoencoder) with d = 2. Each of the eight panels plots reflection amplitude (0 to 1) against frequency (160–240 THz); legend: original data, autoencoder, PCA, KPCA.
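The kind of comparison described in the captions of Figures S2 and S3 can be illustrated for the PCA case with a minimal sketch. This is not the authors' code or data: the spectra below are synthetic curves with two Lorentzian dips, generated only for illustration, and the helper names (`make_synthetic_spectra`, `pca_reconstruct`, `mse`) are hypothetical. The sketch projects each spectrum onto the first d principal components, reconstructs it, and reports the mean squared error for d = 2 and d = 7. KPCA and the autoencoder are not covered here.

```python
import numpy as np


def make_synthetic_spectra(n_samples=200, n_freqs=81, seed=0):
    """Synthetic stand-in data (an assumption, not the paper's dataset):
    reflection-like curves in [0, 1] with two Lorentzian dips each."""
    rng = np.random.default_rng(seed)
    freqs = np.linspace(160.0, 240.0, n_freqs)  # THz, the range shown on the figure axes
    centers = rng.uniform(170.0, 230.0, size=(n_samples, 2))
    widths = rng.uniform(3.0, 10.0, size=(n_samples, 2))
    depths = rng.uniform(0.2, 0.45, size=(n_samples, 2))
    spectra = np.ones((n_samples, n_freqs))
    for k in range(2):
        detuning = (freqs[None, :] - centers[:, k:k + 1]) / widths[:, k:k + 1]
        spectra -= depths[:, k:k + 1] / (1.0 + detuning ** 2)
    return freqs, spectra


def pca_reconstruct(data, d):
    """Project rows of `data` onto the first d principal components and map back."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:d]                 # shape (d, n_freqs)
    codes = centered @ components.T     # reduced representation, shape (n_samples, d)
    return codes @ components + mean    # reconstruction in the original space


def mse(a, b):
    """Mean squared error between two arrays of equal shape."""
    return float(np.mean((a - b) ** 2))


freqs, spectra = make_synthetic_spectra()
errors = {d: mse(spectra, pca_reconstruct(spectra, d)) for d in (2, 7)}
for d, err in errors.items():
    print(f"d = {d}: PCA reconstruction MSE = {err:.3e}")
```

For PCA, the reconstruction error on the fitted data cannot increase as d grows, so the d = 7 error is never larger than the d = 2 error. How large the gap is depends entirely on the data.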

Journal

Statistics, arXiv (Cornell University)

Published: May 7, 2019
