NANO.PTML model for read-across prediction of nanosystems in neurosciences. computational model and experimental case of study

He, Shan; Nader, Karam; Abarrategi, Julen Segura; Bediaga, Harbil; Nocedo-Mena, Deyani; Ascencio, Estefania; Casanola-Martin, Gerardo M.; Castellanos-Rubio, Idoia; Insausti, Maite; Rasulev, Bakhtiyor; Arrasate, Sonia; González-Díaz, Humberto

doi:10.1186/s12951-024-02660-9

Research
Open access
Published: 23 July 2024

Shan He^1,2,3,
Karam Nader²,
Julen Segura Abarrategi²,
Harbil Bediaga³,
Deyani Nocedo-Mena⁴,
Estefania Ascencio^1,2,3,
Gerardo M. Casanola-Martin¹,
Idoia Castellanos-Rubio²,
Maite Insausti^2,5,
Bakhtiyor Rasulev¹,
Sonia Arrasate² &
Humberto González-Díaz^2,6,7

Journal of Nanobiotechnology volume 22, Article number: 435 (2024)

Abstract

Neurodegenerative diseases involve progressive neuronal death. Traditional treatments often struggle because of poor solubility, limited bioavailability, and difficulty crossing the Blood-Brain Barrier (BBB). Nanoparticles (NPs) are garnering growing attention in the biomedical field as carriers of neurodegenerative disease drugs (NDDs) to the central nervous system. Here, we present a combined computational and experimental analysis. In the computational study, we used the IFPTML technique, which combines Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML), to select the most promising Nanoparticle Neuronal Disease Drug Delivery (N2D3) systems. When the IFPTML approach is applied to nanoscience, it is referred to as NANO.PTML. The IF process was carried out between 4403 NDDs assays and 260 NP cytotoxicity assays, yielding a dataset of 500,000 cases. The best IFPTML model was obtained with the Decision Tree (DT) algorithm, which showed satisfactory performance, with specificity values of 96.4% and 96.2% and sensitivity values of 79.3% and 75.7% in the training (375k/75%) and validation (125k/25%) sets, respectively. Moreover, the DT model obtained Area Under Receiver Operating Characteristic (AUROC) scores of 0.97 and 0.96 in the training and validation series, highlighting its effectiveness in classification tasks. In the experimental part, two NP samples (Fe₃O₄_A and Fe₃O₄_B) were synthesized by thermal decomposition of an iron(III) oleate (FeOl) precursor and structurally characterized by different methods. Additionally, to make the as-synthesized hydrophobic NPs (Fe₃O₄_A and Fe₃O₄_B) soluble in water, the amphiphilic CTAB (Cetyl Trimethyl Ammonium Bromide) molecule was employed. Then, to cover a wider range of NP system variants, an illustrative simulation experiment was performed using the IFPTML-DT model, for which a prediction dataset of 500,000 cases was created. The outcome of this experiment highlighted certain NANO.PTML systems as promising candidates for further investigation. The NANO.PTML approach holds potential to accelerate experimental investigations and offer initial insights into various NP and NDDs compounds, serving as an efficient alternative to time-consuming trial-and-error procedures.

Introduction

Neurodegenerative Diseases (NDs) constitute a diverse set of conditions marked by the gradual deterioration and loss of neurons in various regions of the nervous system. These diseases pose a significant challenge to global health because their incidence is increasing. With the expansion of the aging population, the World Health Organization anticipates a threefold increase worldwide in the number of individuals affected by neurodegenerative disorders over the coming three decades. Although the precise mechanisms driving NDs are not fully elucidated, researchers suggest a multifaceted interplay involving genetic, epigenetic, and environmental factors. Presently, there are no established treatments capable of slowing, halting, or preventing the progression of any NDs [1, 2]. For example, diseases like Alzheimer’s and Parkinson’s, which have been recognized for over a century, continue to lack a cure [3,4,5]. Some promising lines of research for the treatment of neurodegenerative disorders include gene therapy [6], development of neuroprotective mimetic peptides [7], and repurposing (or reevaluation) of known drugs [8], among others [9].

One challenge is the interaction between NPs and components of the immune system. Over the past ten years, research has demonstrated that although NPs can be toxic, advances in nanotechnology have enabled the modification of these materials. These modifications can either prevent interaction with the immune system or specifically target it. When nanoparticles are used for medical purposes that do not aim to activate or suppress the immune system, it is beneficial to avoid any immune system interaction [10]. For instance, NPs can be engineered by coating them with poly(ethylene glycol) (PEG) or other polymers, creating a hydrophilic layer that conceals them from the immune system’s detection [11]. Another challenge in the treatment of neurodegenerative disorders lies in the passage of therapeutic agents through the Blood-Brain Barrier (BBB) to reach the Central Nervous System (CNS). To overcome these obstacles, research efforts are directed towards both the development of new drugs and the exploration of innovative drug delivery methods, including targeted nanocarriers [12]. Some of these approaches are nanobodies [13], nano-antibodies, nano-metal particles (gold, silver, iron oxide) [14] and lipid nanoparticles (nanoliposomes) [15]. These nano-approaches, applied to drug R&D as innovative delivery systems for NDs, face inherent challenges: arduous experimental work associated with high costs, low stability profiles, short useful lives, and inconsistency between and within production batches [16].

In this sense, Machine Learning (ML) techniques can be useful for analyzing, predicting, and selecting the optimal delivery nano-system to treat neurodegenerative diseases (Nanoparticle Neuronal Disease Drug Delivery systems, hereafter “N2D3 systems”). ML has been successfully used to predict biomedical properties of NPs of medical interest. These studies cover the influence of particle physicochemical properties on cellular uptake, cytotoxicity, molecular loading, and molecular release, as well as manufacturing properties such as NP size and polydispersity [17, 18]. To design new N2D3 systems, an ML algorithm needs to analyze multiple output properties (IC₅₀, K_i, etc.) of a broad range of N2D3 systems with different transported substances (drugs), nanocarriers, coatings, etc., under various conditions such as cell lines and organisms (labels) [19]. On the other hand, Gajewicz et al. [20] have recently discussed the scarcity and/or dispersion (different sources of information) of nanotoxicity data, with special emphasis on the low variety of drugs transported by current N2D3 systems in contrast to the high number of free-drug assays [21,22,23,24,25]. Consequently, to address the N2D3 systems design problem, an ML model should be multi-output (able to predict multiple outputs), read-across species (able to infer properties for different species), multi-label (able to consider multiple cell lines, etc.), and able to consider multiple sources of information at the same time. For this purpose, our group introduced the Information Fusion (IF) + Perturbation-Theory (PT) + Machine Learning (ML) algorithm. IFPTML takes information from different sources (drug assays, NP assays, proteomics, metabolic networks, etc.) and carries out an IF process, then uses PT operators to quantify the variability of the data, and finally uses ML algorithms to seek a predictive model and predict new N2D3 systems. For specific applications to nanoscience, the algorithm is called NANO.PTML. The NANO.PTML algorithm has previously been applied successfully to different types of NP systems [26,27,28,29].

In this paper, we first use the NANO.PTML algorithm to find a new ML model able to predict new N2D3 systems. Furthermore, to illustrate the applicability of the NANO.PTML model in practice, we report an additional computational-experimental case study. In this case study, we first carried out the synthesis and characterization of two new NPs with potential application in the development of N2D3 systems. Next, we used the NANO.PTML model to simulate the outcomes of 500,000 different assays of N2D3 systems based on the two NPs reported. These predictions involve different combinations of up to 123 drugs, 53 cell lines, 16 NP coats, 5 NP core types, and 5 NP shapes. The outcome of this experiment helps to identify promising N2D3 systems and to gain insights into their behavior across different cell lines and coating agents, among other factors, which could offer valuable guidance for future studies. NANO.PTML model predictions and their experimental validation could offer a promising alternative to traditional trial-and-error methods and pave the way for more efficient N2D3 systems for neurodegenerative diseases.

Materials and methods

Experimental methods

Materials

Iron(III) chloride, 1-octadecene, oleic acid, dibenzyl ether, chloroform and Cetyl Trimethyl Ammonium Bromide (CTAB) were purchased from Sigma-Aldrich. Sodium oleate, ethanol, hexane and tetrahydrofuran were purchased from TCI, PanReac, Honeywell and Emplura, respectively.

Experimental characterization

X-ray diffraction (DRX) patterns of the hydrophobic Fe₃O₄ NPs were recorded using a PANalytical X’Pert PRO diffractometer equipped with a copper anode (operated at 40 kV and 40 mA), a diffracted beam monochromator and a PIXcel detector. Scans were collected in the 10–90° 2θ range with a step size of 0.02° and a scan step time of 1.25 s. The amount of organic matter in the hydrophobic Fe₃O₄ NPs was determined via thermogravimetric analysis (TGA), performed in a NETZSCH STA 449 C thermogravimetric analyser, by heating 10 mg of dry sample at 10 °C/min in Ar atmosphere. Dynamic Light Scattering (DLS) and Zeta potential (ζ) measurements of the NPs functionalized with CTAB were performed in a Zetasizer Nano-ZS (Malvern Instruments). The measurements were carried out at 25 °C after an equilibration time of 1 min for 0.05 mg·mL⁻¹ Fe₃O₄@CTAB aqueous dispersions. For each sample, 10 runs of 10 s were performed with three repetitions. A Philips CM200 Transmission Electron Microscope (TEM) with an accelerating voltage of 200 kV and a point resolution of 0.235 nm was used to analyse the morphology of the samples. Magnetic measurements as a function of the magnetic field, M(H), at Room Temperature (RT) were obtained in a Vibrating-Sample Magnetometer (VSM) by measuring the magnetization of the dried hydrophobic nanoparticles and normalizing the magnetization value per unit mass of inorganic matter.

Preparation of Fe₃O₄ nanoparticles

Two different Fe₃O₄ nanoparticles (NPs) were synthesized by thermal decomposition of an iron(III) oleate (FeOl) precursor which was previously prepared from iron(III) chloride and sodium oleate mixed in a mixture of solvents (hexane, ethanol and distilled water). The synthesis process of both the FeOl precursor and the Fe₃O₄ NPs was formerly analyzed and optimized throughout different works [30,31,32].

For this work, two samples composed of NPs of similar average dimension (≈ 20 nm) but different morphology (cuboctahedral and octahedral) were prepared (samples Fe₃O₄_A and Fe₃O₄_B). Sample Fe₃O₄_B was prepared by mixing 10 mmol of the previously prepared FeOl precursor with 20 mL of 1-octadecene, 10 mL of dibenzyl ether and 6.4 mL of oleic acid and heating the mixture to reflux (around 320 °C). The resulting hydrophobic Fe₃O₄ NPs coated with oleic acid were washed by centrifugation 3 times (at 9500 rpm) with ethanol and tetrahydrofuran and, finally, collected in chloroform and stored in the fridge at 4 °C. Fe₃O₄_A was prepared similarly, but in this case the synthesis was scaled up twofold to analyze the effect of this synthetic parameter on the features of the NPs.

To make the as-synthesized hydrophobic NPs (Fe₃O₄_A and Fe₃O₄_B) soluble in water (Fig. 1a), a coating approach based on a previously refined protocol was carried out [33]. In this case, instead of the poly(maleic anhydride-alt-1-octadecene) (PMAO) polymer, the amphiphilic CTAB (Cetyl Trimethyl Ammonium Bromide) molecule was used for the coating (Fig. 1b). A CTAB solution in chloroform was added to a 1 mg/mL stock solution of NPs (maintaining a ratio of 100 molecules per nm² of Fe₃O₄ NP surface). After stirring the mixture for 15 min, the solvent was evaporated under vacuum and the nanoparticles were dispersed in chloroform. This process was repeated three times, and the last redispersion was carried out using distilled H₂O. Finally, the two samples functionalized with CTAB (Fe₃O₄_A@CTAB and Fe₃O₄_B@CTAB NPs) were further washed with distilled water (3 times) by centrifugation to remove the excess CTAB that was not attached to the surface of the NPs. The scheme of the NP coating process is displayed in Fig. 1.

Computational methods

In a previous work, we collected three datasets from different databases. The first dataset (Dataset 01), taken from ChEMBL and containing information from preclinical trials of different NDDs, was merged with Dataset 02, built from NP data collected from the literature. As a result, three large subsets (Subset 1, Subset 2, Subset 3) with different variables were obtained, from which the best IFPTML model for effective N2D3 systems was derived [34]. In this work, we reprocessed all the information with Python algorithms in order to provide, at the same time, open-access code for this problem. To construct the IFPTML models, we followed the sequential steps outlined in Fig. 2, which illustrates the overall workflow of the computational procedures employed in this study. Additionally, to facilitate comprehension, each step was annotated with a corresponding enumeration (e.g., 2.2.1, 2.2.2).

NP cytotoxicity dataset

Simultaneously, the dataset of preclinical assays for cytotoxicity/ecotoxicity of NPs was collected from 62 papers (step 2.2.2 in Fig. 2). This dataset contained 260 preclinical assays for 31 NPs, an average of approximately 8.39 assays per NP. Furthermore, the dataset covered a wide range of NP properties, including morphology, physicochemical properties, coating agents, assay duration, and measurement conditions. These properties were represented as discrete variables (c_nj) used to characterize the conditions and labels of each assay. We grouped all specific conditions of each assay into a general vector c_nj = [c_n1, c_n2, c_n3, …, c_nmax]. These variables were biological activity parameters (c_n0), cell lines utilized in assays (c_n1), NP shapes (c_n2), measurement conditions (c_n3), and coating agents (c_n4). Please see more details about the dataset content in the Supporting Information SI00.docx, 1.1.1. NP cytotoxicity dataset.

NDDs dataset from ChEMBL

First, 4403 preclinical assays of Neurodegenerative Disease Drugs (NDDs) were downloaded from the ChEMBL database (step 2.2.1 in Fig. 2) [35,36,37]. The dataset comprised 2566 different NDDs, an average of approximately 1.72 assays per drug. Additionally, we defined as categorical variables (c_dj) the conditions covering biological activity parameters (c_d0), target proteins associated with NDDs (c_d1), cell lines used in NDDs assays (c_d2), and organisms involved (c_d3). The nature and quality of the data were also defined as categorical variables, including type of target (c_d4), type of assay (c_d5), data curation (c_d6), confidence score (c_d7), and target mapping (c_d8). Additionally, the database provided molecular descriptors (D_dk = [D_d1, D_d2]) to characterize the chemical structure of NDDs compounds. Specifically, two types of molecular descriptors were used for each compound: the logarithm of the n-Octanol/Water Partition coefficient (LOGP_i) and the Topological Polar Surface Area (PSA_i). Please see more details about the dataset content in the Supporting Information SI00.docx, 1.1.2. NDDs dataset from ChEMBL.

IF process drug nanoparticle delivery system (DNDS) pair resampling

Initially, we utilized the objective value v_ij to formulate the IFPTML model. The IFPTML model involved two types of observed values, denoted as v_ij(c_d0) and v_nj(c_n0), corresponding to NDDs and NPs, respectively. Additionally, we established the target function by employing the descriptor vectors D_dk (for the drugs) and D_nk (for NPs) as input variables in the AI/ML model. To simulate a real experiment with the N2D3 systems, we prioritized certain properties while minimizing others. To do this, we defined a desirability value, d(c_d0) or d(c_n0), for each property: d = 1 when the value of v_ij(c_d0) or v_nj(c_n0) needed to be maximized, and d = -1 otherwise. We then used a cutoff to rescale v_ij(c_d0) and v_nj(c_n0) into the observed functions f(v_ij(c_d0))_obs and f(v_nj(c_n0))_obs. These values were obtained as follows: f(v_ij(c_d0))_obs = 1 if v_ij(c_d0) > cutoff and d(c_d0) = 1, or if v_ij(c_d0) < cutoff and d(c_d0) = -1; f(v_ij(c_d0))_obs = 0 otherwise. The function f(v_nj(c_n0))_obs was obtained analogously. Please see more details in the Supporting Information SI00.docx, 1.1.3. IF process DNDS pair resampling.
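The desirability and cutoff rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `observed_class` and the example values and cutoffs are assumptions made here, not values taken from the datasets or from the authors' code.

```python
def observed_class(value, cutoff, desirability):
    """Return f_obs: 1 if the value lies on the desired side of the cutoff, else 0.

    desirability = +1 means the property should be maximized,
    desirability = -1 means it should be minimized.
    """
    if desirability == 1:
        return 1 if value > cutoff else 0
    if desirability == -1:
        return 1 if value < cutoff else 0
    raise ValueError("desirability must be +1 or -1")


# Hypothetical assays (property names, values and cutoffs are illustrative only).
assays = [
    {"property": "IC50 (nM)", "value": 50.0, "cutoff": 100.0, "d": -1},
    {"property": "IC50 (nM)", "value": 250.0, "cutoff": 100.0, "d": -1},
    {"property": "Activity (%)", "value": 80.0, "cutoff": 50.0, "d": 1},
]

for assay in assays:
    f_obs = observed_class(assay["value"], assay["cutoff"], assay["d"])
    print(assay["property"], assay["value"], "->", f_obs)
```

A value exactly equal to the cutoff is assigned class 0 in this sketch, following the strict inequalities in the text.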

Definition of objectives and reference functions

Another input variable of the IFPTML model is the reference/objective function, defined as f(v_ij(c_d0), v_nj(c_n0))_ref. This function defines the expected probability f(v_ij(c_d0), v_nj(c_n0))_ref = p(f(v_ij(c_d0), v_nj(c_n0))_ref = 1) of obtaining the desired activity for a particular property. It is calculated from the number of positive outcomes, n(f(v_ij(c_d0)) = 1) for drugs and n(f(v_nj(c_n0)) = 1) for NPs, divided by the total number of cases for the NDDs and NP systems individually. These functions are expressed as: f(v_ij(c_d0))_ref = p(f(v_ij(c_d0))_ref = 1) = n(f(v_ij(c_d0))_ref = 1)/n(c_d0)_j and f(v_nj(c_n0))_ref = p(f(v_nj(c_n0))_ref = 1) = n(f(v_nj(c_n0))_ref = 1)/n(c_n0)_j. Please see more details in the Supporting Information SI00.docx, 1.1.4. Definition of objectives and reference functions, and Fig. 3.
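As a rough illustration of the reference function, the sketch below computes, for each property label, the fraction of assays whose observed class is 1. The helper name `reference_probability` and the toy records are assumptions introduced here for illustration; they are not part of the published code or data.

```python
from collections import defaultdict


def reference_probability(records):
    """Map each property label to n(f_obs = 1) / n(total) for that label.

    `records` is an iterable of (label, f_obs) pairs with f_obs in {0, 1}.
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for label, f_obs in records:
        totals[label] += 1
        positives[label] += f_obs
    return {label: positives[label] / totals[label] for label in totals}


# Hypothetical drug assays: (c_d0 label, observed class).
drug_records = [
    ("IC50", 1), ("IC50", 0), ("IC50", 1),
    ("Ki", 0), ("Ki", 0), ("Ki", 1), ("Ki", 0),
]

f_ref = reference_probability(drug_records)
print(f_ref)  # IC50 -> 2/3, Ki -> 1/4
```

The same helper can be applied separately to the NP assays, since the text computes the two reference probabilities individually.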

PTO calculation

IFPTML N2D3 systems data analysis phases

The dataset under study was formed by structural descriptor vectors, denoted as D_nk and D_dk, for NPs [38,39,40] and NDDs [35, 41,42,43], respectively. Furthermore, we defined assay condition vectors, c_nj and c_dj, to denote the labels for both NPs and NDDs. For more detailed information about the structural descriptors and assay condition vectors, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (IFPTML N2D3 systems data analysis phases).

Preprocessing of PT data

The IFPTML study incorporates all vectors c_dj and c_nj, representing the non-numerical experimental conditions and labels for both NDDs and NP preclinical assays. Subsequently, we calculated the Perturbation Theory Operators (PTOs), taking into account the Moving Average (MA) of the NDDs and NP descriptors (see Eq. 1 and Eq. 2). PT starts from the experimental/observed value of an already known activity and adds the perturbations/variations to the system [26, 27, 44,45,46,47]. For more detailed information, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (Preprocessing of PT data).

$$\Delta {\rm{D}}\left( {{{\rm{D}}_{{\rm{dk}}}}} \right) = {{\rm{D}}_{{\rm{dk}}}} - \left\langle {{{\left( {{{\rm{D}}_{{\rm{dk}}}}} \right)}_{{{\bf{c}}_{{\rm{dj}}}}}}} \right\rangle$$

(1)

$$\Delta {\rm{D}}\left( {{{\rm{D}}_{{\rm{nk}}}}} \right) = {{\rm{D}}_{{\rm{nk}}}} - \left\langle {{{\left( {{{\rm{D}}_{{\rm{nk}}}}} \right)}_{{{\bf{c}}_{{\rm{nj}}}}}}} \right\rangle$$

(2)
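Equations 1 and 2 subtract from each descriptor the average of that descriptor over all assays sharing the same condition label. A minimal sketch of this moving-average deviation is given below; the function name, the dictionary keys and the numerical values are hypothetical and serve only to illustrate the operation.

```python
from collections import defaultdict


def moving_average_deviation(rows, condition_key, descriptor_key):
    """Return D - <D>_c for every row, where <D>_c is the mean of the
    descriptor over all rows sharing the same condition label c."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[condition_key]] += row[descriptor_key]
        counts[row[condition_key]] += 1
    means = {label: sums[label] / counts[label] for label in sums}
    return [row[descriptor_key] - means[row[condition_key]] for row in rows]


# Hypothetical drug assays with a PSA descriptor and a cell-line condition.
rows = [
    {"cell_line": "A549", "PSA": 60.0},
    {"cell_line": "A549", "PSA": 80.0},
    {"cell_line": "HepG2", "PSA": 100.0},
]

deltas = moving_average_deviation(rows, "cell_line", "PSA")
print(deltas)  # [-10.0, 10.0, 0.0]
```

The same routine applies to the NP descriptors D_nk with the NP condition labels c_nj.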

NANO.PTML models training and validation overview

In developing the model using ML techniques, each sample case is assigned to either the training (subset = t) or validation (subset = v) series. The assignment of cases should be random, representative, and stratified [48, 49]. Accordingly, we assigned three-quarters of the whole dataset to subset = t (training) and one-quarter to subset = v (validation), so that the 75%/25% proportion between training and validation was kept [48]. Additionally, the performance of the NANO.PTML models was evaluated using different statistical metrics, particularly Sensitivity (Sn) and Specificity (Sp) [50, 51]. For more detailed information, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (NANO.PTML models training and validation overview).
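The sketch below illustrates, under simple assumptions, a stratified 75/25 split and the computation of Sn, Sp and the Matthews correlation coefficient from a confusion matrix. It uses only the Python standard library; the helper names, the seed and the toy labels are choices made here for illustration and do not reproduce the authors' pipeline.

```python
import math
import random


def stratified_split(labels, validation_fraction=0.25, seed=0):
    """Split indices so that each class keeps the same train/validation ratio."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_valid = round(len(idx) * validation_fraction)
        valid.extend(idx[:n_valid])
        train.extend(idx[n_valid:])
    return sorted(train), sorted(valid)


def sn_sp_mcc(y_true, y_pred):
    """Sensitivity, specificity and MCC for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, mcc


# Toy imbalanced labels: 20 positives and 80 negatives.
labels = [1] * 20 + [0] * 80
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 75 25
print(sn_sp_mcc([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because the split is done per class, the validation series keeps the same class proportions as the full dataset.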

NANO.PTML simulation of experimental case of study

We conducted a computational analysis to illustrate the applicability of the NANO.PTML model in an example of a real wet-laboratory setting. In this context, we predicted the behavior of Fe₃O₄-core NPs with CTAB as the coating system, as reported in the experimental part of this work. To create a more ambitious prediction experiment, we added multiple combinations of Fe-based cores, coatings, cell lines, and shapes. In particular, this prediction dataset was formed by diverse combinations of up to 123 drugs, 53 cell lines, 16 coats, 5 NP cores and 5 NP shapes. The NP cores studied were CoFe₂O₄, ZnFe₂O₄, Fe₃O₄, Fe₂O₃ and Fe. Additionally, the cell lines used in the cytotoxicity predictive study were L929 (M), HepG2 (H), and A549 (H), among others. On the other hand, the organisms used in the eco-toxicity computational study were Vibrio fischeri, Oryzias latipes (embryos), etc. Furthermore, there were different NP shapes such as irregular, elliptical, etc. Finally, the NP coating agents studied in this research were Polyvinyl alcohol (PVA), Polyvinylpyrrolidone (PVP), CTAB, potato starch (PS), PEG-Si(OMe)₃ (PEG), etc. For more detailed information about the simulation experiment, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (NANO.PTML simulation of experimental case of study).
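A prediction set of this kind can be enumerated as a Cartesian product of the factor levels. The sketch below uses only the levels named in the text plus two placeholder drug identifiers (`drug_001`, `drug_002`), which are hypothetical. How the authors arrived at 500,000 cases is not specified here; the arithmetic only shows that the full product of the stated maxima would be larger, which is consistent with the "up to" wording.

```python
from itertools import islice, product

# Factor levels named in the text (only a subset of each full list).
cores = ["CoFe2O4", "ZnFe2O4", "Fe3O4", "Fe2O3", "Fe"]
coats = ["PVA", "PVP", "CTAB", "PS", "PEG"]
cell_lines = ["L929 (M)", "HepG2 (H)", "A549 (H)"]
shapes = ["irregular", "elliptical"]
# Placeholder drug identifiers (hypothetical, for illustration only).
drugs = ["drug_001", "drug_002"]


def candidate_systems():
    """Lazily enumerate every drug / cell line / coat / core / shape combination."""
    return product(drugs, cell_lines, coats, cores, shapes)


n_listed = sum(1 for _ in candidate_systems())
upper_bound = 123 * 53 * 16 * 5 * 5  # product of the maxima stated in the text

for system in islice(candidate_systems(), 3):
    print(system)
print(n_listed, upper_bound)  # 300 2607600
```

Using a lazy iterator keeps memory use low even when the number of combinations reaches the millions.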

Results and discussion

AI/ML python computational models

To design AI/ML models for predicting the NP system as a neurodegenerative drug carrier, the Scikit-Learn module in Python [52] was used to identify the best AI/ML estimator. In this context, linear and non-linear classifiers were employed, specifically Linear Discriminant Analysis (LDA) [53], Decision Tree (DT) [54], Random Forest (RF) [55], k-Nearest Neighbor (kNN) [56], and Gradient Boosting (GB) [57]. Additionally, the Expert-Guided Selection (EGS) [34] approach was employed to identify the most significant variables capable of defining the NANO.PTML system. The following variables were considered crucial for describing the NANO.PTML system: for the neurodegenerative drug, ΔDPSA(c_I)_dj (deviation of topological Polar Surface Area); and for the NPs as drug carriers, ΔDt(c_III)_nj (deviation of NP safety time), ΔDLnp(c_III)_nj (deviation of NP length), ΔDVnpu(c_III)_nj (deviation of NP core volume), ΔDVxcoat(c_III)_nj (deviation of McGowan volume), and ΔDVvdwMGcoat(c_III)_nj (deviation of van der Waals volume from McGowan volume). Table 1 and Fig. 4 present the statistical parameters obtained by the linear and non-linear models. The results showed that the DT classifier exhibited a good fit in both the training and validation sets, with Specificity (Sp) values of 96.4%/96.2% and Sensitivity (Sn) values of 79.3%/75.7%, respectively. Another important statistical parameter is the Matthews Correlation Coefficient (MCC) [58], with values of 0.6722/0.6401 in the training/validation series.

Table 1 Statistical parameters used for NANO.PTML models

The hyperparameters of the DT algorithm, which play a crucial role in determining its performance and behavior, were tuned [59]. The best combination found was the following. The ccp_alpha parameter, which controls cost-complexity pruning of the tree, was set to 0.0, so no pruning was applied. The class_weight parameter assigns weights to the classes in the dataset; here, class 0 was weighted at 40% and class 1 at 60% to address the imbalance in class distribution. The criterion was set to “gini”, so Gini impurity was used as the measure of split quality, which influences how the tree partitions the feature space. Furthermore, max_depth was set to 15, limiting the depth of the tree to prevent it from growing overly complex and overfitting the training data. The max_features and max_leaf_nodes parameters were both set to “None”, allowing the tree to explore all available features and leaf node possibilities, respectively, without additional constraints. The min_impurity_decrease parameter, set to 0.0, defines the minimum impurity decrease required for a split. The min_samples_leaf and min_samples_split parameters were set to 5 and 2, respectively; they establish the minimum number of samples required in a leaf node and for a node split, which helps the tree generalize rather than become overly specific to the training data. The min_weight_fraction_leaf parameter was set to 0.0, meaning that it was not applied, while random_state was set to 42 to ensure reproducibility across runs. Finally, splitter was set to “best”, so that the best split at each node is chosen according to the selected criterion. Further information about these parameters can be found in the Scikit-learn documentation [52]. The hyperparameters used for LDA, kNN, etc. can be found in Table S1 of the Supporting Information SI00.docx.
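For readers who want to reproduce the configuration, the hyperparameters listed above map directly onto the arguments of scikit-learn's `DecisionTreeClassifier`. The sketch below fits such a tree on synthetic data generated with `make_classification`; the data, the number of features and the class proportions are assumptions made purely for illustration and are unrelated to the 500,000-case dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (NOT the NANO.PTML dataset).
X, y = make_classification(
    n_samples=2000, n_features=7, n_informative=5, n_redundant=0,
    weights=[0.9, 0.1], random_state=42,
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hyperparameters as reported in the text.
model = DecisionTreeClassifier(
    ccp_alpha=0.0,
    class_weight={0: 0.4, 1: 0.6},
    criterion="gini",
    max_depth=15,
    max_features=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    min_samples_leaf=5,
    min_samples_split=2,
    min_weight_fraction_leaf=0.0,
    random_state=42,
    splitter="best",
)
model.fit(X_train, y_train)

# Size of the fitted tree: depth, number of leaves, total number of nodes.
print(model.get_depth(), model.get_n_leaves(), model.tree_.node_count)
print("validation accuracy:", model.score(X_valid, y_valid))
```

In any binary tree of this kind the total number of nodes equals twice the number of leaves minus one, which provides a quick consistency check on reported tree sizes.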

Figure 5 depicts the structure of the decision tree, comprising 3249 nodes with a depth of 15 layers and terminating in 1625 leaf nodes. Final predictions are made at the leaf nodes based on the input data [60]. To facilitate understanding of this tree plot, we have focused the explanation on a tree depth of 2 layers, resulting in 4 leaf nodes, which collectively form 7 main families. This analysis involved input variables such as ΔDVvdwMGcoat(c_III)_nj, f(v_ij(c_d0),v_nj(c_n0))_ref, and ΔDLnp(c_III)_nj. A full description of each family can be found in Table 2. For example, family i is composed of NPs with lower McGowan volume deviation than families v-vii and lower prior probability of activity than families ii-iv.

Table 2 Description of the 7 main families within the DT structure. The color of each family consistently matches that depicted in Fig. 4

Overall, this implies smaller NPs, possibly with lower polarizability, and lower expected biological property values, suggesting an overall reduced likelihood of drug-NP activity. Only 0.4% of cases in this family are predicted as class 1. Consequently, NPs in this family should not be short-listed for assay according to the DT model. On the right section of the DT, family ii is composed of NPs with higher McGowan volume deviation than family i and lower prior probability of activity than families iii and iv. In general, this indicates larger NPs, possibly with higher polarizability, and lower expected biological property values for drug and NP, suggesting an overall increased activity likelihood. However, only 1.5% of cases are predicted as class 1. Therefore, NPs in this family should not be short-listed for assay according to the model either. Families iii and iv yielded more promising results, with 4% and 3.3% of class 1, respectively. Family iii suggests smaller NPs, possibly with lower polarizability and low to medium expected biological property values, indicating an overall reduced likelihood of drug-NP activity. Conversely, family iv suggests larger NPs with higher polarizability; its medium to high biological property values indicate a higher likelihood of drug-NP activity.

Another statistical metric used in this study is the Area Under Receiver Operating Characteristic (AUROC), computed for both the training and validation sets (see Fig. 6) [48]. A higher AUROC value indicates better overall ability of the model to correctly classify instances from both classes. An AUROC of 1.0 represents a perfect classifier, while an AUROC of 0.5 indicates a classifier that performs no better than random guessing [48]. The highest AUROC values, 0.97 and 0.96, are obtained by the DT algorithm, in agreement with the Sn/Sp results in the training/validation sets. In contrast, the LDA algorithm is not among the top-performing classifiers, with AUROC values ranging from 0.73 to 0.74.
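The AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The following standard-library sketch computes it directly from that definition; the function name and toy scores are illustrative assumptions, and for large datasets a library routine would be far more efficient than this quadratic loop.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly;
    tied pairs count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
# A constant score cannot separate the classes at all.
print(auroc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))   # 0.5
```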

Contrast with earlier AI/ML algorithms

Recent research works have addressed a wide variety of problems related to the discovery of NPs and/or NDDs (see Table 3). Most of these studies explore the cytotoxicity of NPs, or NDDs assays against a large number of species, by applying NANO.PTML models. Nevertheless, to the best of our knowledge, no study includes both NP and NDDs components simultaneously or considers the opportunity of developing N2D3 systems. For example, Kleandrova et al. developed a combined QSTR-perturbation model to simultaneously explore the ecotoxicity and cytotoxicity of NPs under different experimental conditions, including diverse measures of toxicity, multiple biological targets, compositions, sizes and the conditions used to measure those sizes, shapes, the times during which the biological targets were exposed to NPs, and coating agents [44]. The model was obtained from 36,488 cases of NP-NP pairs. Nevertheless, the work of Kleandrova et al. is restricted to the ecotoxicity and cytotoxicity of NPs and does not consider data on NDDs components. Similarly, Cordeiro et al. built a QSAR-perturbation model involving 5520 cases (NP–NP pairs). The aim of this model is the simultaneous prediction of the ecotoxicity of NPs against several assay organisms (bio-indicators), also considering multiple measures of ecotoxicity, as well as the chemical compositions, sizes, conditions under which the sizes were measured, shapes, and the time during which the assay organisms were exposed to NPs [40]. Like the previous model, it does not take the biological activity of NDDs into account. On the other hand, Luan et al. generated an mx-QSAR model from 4915 cases of multiple assays of neurotoxicity/neuroprotective effects of drugs. The model was trained with a dataset involving diverse assay endpoints of 2217 compounds. Each compound was assayed in at least one out of 338 assays, which included 148 molecular or cellular targets and 35 standard type measures in 11 model organisms (including human). Unlike the previous models, this mx-QSAR algorithm contains information on NDDs; however, it does not consider the NP as part of the system [61]. In this paper, we developed an innovative system including both NP and NDDs components. The results of the NANO.PTML-DT model were quite satisfactory, with Sp values of 96.4%/96.2% and Sn values of 79.3%/75.7% in the training and validation series, which included 375k and 125k cases, respectively. In other research with a scope similar to that of the present work, García et al. built an LDA linear model to predict the results of 42 different experimental tests for GSK-3 inhibitors with heterogeneous structural patterns. GSK-3β inhibitors are interesting candidates for developing anti-Alzheimer compounds, among other pressing diseases. These authors obtained Sn/Sp ≈ 90% in the training/validation series [62]. On the other hand, Ferreira da Costa et al. constructed an LDA model to predict the properties of a query compound or molecular system in experimental assays with multiple boundary conditions involved in the dopamine pathway. They obtained Sn/Sp ≈ 70–91% in both the training and validation series [63]. However, it is worth mentioning that contrasting the statistical parameters of the model in this work with those of the previous ones is not informative, because the design of each model is specific to the problem addressed.

Table 3 NDDs and NP cytotoxicity studies using AI/ML algorithms in previous research works


Experimental study of new system

Characterization of Fe₃O₄ nanoparticles

Initially, the hydrophobic NPs (samples Fe₃O₄_A and Fe₃O₄_B) were structurally, morphologically, and magnetically characterized (Table 4). Both samples present the inverse spinel structure of magnetite (Fe₃O₄, S.G. Fd-3m) with no traces of secondary phases. The crystallite sizes of the samples were calculated from the most intense diffraction peak (311) of the X-ray powder diffraction patterns using Scherrer’s equation. The calculated crystallite sizes of the two samples are around 24 nm and are compatible with the average physical size determined by TEM analysis (see Table 4; Fig. 7). The rather good agreement between the two techniques (DRX and TEM) indicates that the NPs of both samples are composed of single nanocrystals. Regarding morphology, sample Fe₃O₄_A is composed of NPs with more facets (cuboctahedrons), while the NPs of sample Fe₃O₄_B present an octahedral-like shape, as can be seen in Fig. 7a and b, respectively.
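As a rough illustration of the Scherrer estimate mentioned above, the sketch below evaluates D = Kλ/(β·cos θ) in Python. The shape factor K = 0.9, the Cu Kα wavelength (0.15406 nm), the peak position, and the peak width are all assumptions chosen for illustration; the text does not report them, and no instrumental-broadening correction is applied.

```python
import math


def scherrer_size_nm(fwhm_deg: float, two_theta_deg: float,
                     wavelength_nm: float = 0.15406,
                     shape_factor: float = 0.9) -> float:
    """Crystallite size D = K * lambda / (beta * cos(theta)).

    fwhm_deg      -- full width at half maximum of the peak, in degrees 2-theta
    two_theta_deg -- peak position, in degrees 2-theta
    wavelength_nm -- X-ray wavelength (assumed Cu K-alpha by default)
    shape_factor  -- dimensionless K (0.9 is a common assumption)
    """
    beta = math.radians(fwhm_deg)              # peak width in radians
    theta = math.radians(two_theta_deg / 2.0)  # Bragg angle in radians
    return shape_factor * wavelength_nm / (beta * math.cos(theta))


# Hypothetical (311) peak: position and width are illustrative values only.
size = scherrer_size_nm(fwhm_deg=0.35, two_theta_deg=35.4)
print(f"Estimated crystallite size: {size:.1f} nm")
```

With these assumed inputs the estimate falls in the low-20 nm range, the same order as the crystallite sizes quoted in the text; narrower peaks give larger sizes, since D is inversely proportional to β.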

Table 4 Summary of the features of the two Fe₃O₄ NP samples: weight% of the organic matter (O.M.%) in the as-synthesized hydrophobic NPs, size of the crystalline domain (D_DRX) from the Scherrer calculation on the main (311) diffraction peak, average dimension of the inorganic core obtained by TEM (D_TEM), saturation magnetization (M_S) of the inorganic core at RT, and hydrodynamic size (D_H) and Z potential (Z) of the hydrophilic NPs coated with CTAB


The dependence of the magnetization on the magnetic field, M(H), was measured in the two samples by DC magnetometry at RT. The M(H) curves in Fig. 7d display saturation magnetizations (M_S) of 88 and 91 Am²/kg_Fe3O4, respectively, which indicates the high quality of the magnetite phase and the purity of the inorganic core. After coating the hydrophobic NPs with CTAB, both samples (Fe₃O₄_A@CTAB and Fe₃O₄_B@CTAB NPs) become highly soluble in water, as shown by the Z potential values, which are positive due to the cationic nature of the CTAB molecule (see Table 4; Fig. 7b). Regarding the degree of agglomeration of the NPs in water dispersion, the moderate hydrodynamic diameters (see Table 4), compared with the average diameter of a single NP determined by DRX and TEM, suggest that these NPs are arranged in small clusters (2–5 NPs).

This experimental section focuses specifically on the Fe₃O₄ NP core with two shapes (cuboctahedral and octahedral) and on the CTAB coating. We performed a computational analysis to demonstrate the practical application of the NANO.PTML model using a real-world wet-laboratory scenario. Additionally, we carried out a simulation experiment that tries to mimic this experimental part. For this purpose, we created a prediction dataset with various combinations of NP systems, including NP cores, coating agents, cell lines, shapes, and anti-neurodegenerative drugs linked with certain coatings. It is important to note that the total number of combinations, considering NP cores, cell lines, shapes, coating agents, and anti-neurodegenerative drugs, amounts to N_tot = n(NP cores) · n(cell lines) · n(NP shapes) · n(NP coats) · n(drugs) = 5 · 53 · 5 · 16 · 123 = 2,607,600 assays. Performing all these combinations in a wet laboratory is impractical, time-consuming, and resource-intensive. Even with expert criteria, the number of assays remains unmanageable. Therefore, the NANO.PTML-DT approach is introduced to address this issue by reducing the number of assays and serving as a guide for the experimental part, highlighting the most promising combinations of NP systems as drug carriers for neurodegenerative diseases.
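The combinatorial count above can be checked with a few lines of Python. The dictionary keys are illustrative labels introduced here; the level counts are the ones given in the text. The last lines relate the full grid to the 500,000-case prediction dataset used in the simulation experiment.

```python
from math import prod

# Number of levels per factor, as stated in the text.
levels = {
    "NP cores": 5,
    "cell lines": 53,
    "NP shapes": 5,
    "NP coats": 16,
    "drugs": 123,
}

n_tot = prod(levels.values())
print(f"N_tot = {n_tot:,} assays")  # N_tot = 2,607,600 assays

# Share of the full grid covered by a 500,000-case prediction dataset.
n_pred = 500_000
fraction = n_pred / n_tot
print(f"Prediction dataset covers {100 * fraction:.1f}% of the full grid")
```

Even this reduced dataset is far beyond what a wet laboratory could test one assay at a time, which is the motivation for the model-guided screening.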

Experimental vs. computational illustrative case of study

NANO.PTML-DT simulation experiment

In this section, a computational case study is presented to simulate the Fe₃O₄_A@CTAB and Fe₃O₄_B@CTAB NPs from the experimental study detailed in this paper (Fig. 8). The aim of this simulation experiment was to forecast the best combination of NP core vs. cell lines (cytotoxicity or ecotoxicity) vs. shapes vs. coating agents, as mentioned in the previous section. In this scenario, we created a total of 500,000 assays as a new prediction dataset, formed by up to n(NPs core) = 5, n(cn1 = cell lines) = 53, n(cn2 = NP shapes) = 5, n(NPs coat) = 16, and n(drugs) = 123.
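One possible way to assemble such a prediction dataset without materializing every combination is sketched below in Python. The text does not state how the 500,000 cases were chosen; uniform random sampling without replacement is an assumption made here, and the function names are hypothetical. Each flat index is decoded into one (core, cell line, shape, coat, drug) tuple of level indices.

```python
import random

# Levels per factor: cores, cell lines, shapes, coats, drugs (from the text).
SIZES = (5, 53, 5, 16, 123)


def decode(index: int, sizes=SIZES) -> tuple:
    """Map a flat index to one tuple of level indices (mixed-radix decoding)."""
    combo = []
    for size in reversed(sizes):
        index, level = divmod(index, size)
        combo.append(level)
    return tuple(reversed(combo))


def sample_combinations(k: int, sizes=SIZES, seed: int = 0) -> list:
    """Draw k distinct combinations uniformly at random (assumed strategy)."""
    total = 1
    for size in sizes:
        total *= size
    rng = random.Random(seed)
    return [decode(i, sizes) for i in rng.sample(range(total), k)]


# Small demonstration; k=500_000 works the same way, only slower.
subset = sample_combinations(1000)
print(subset[:3])
```

Sampling indices from `range(total)` guarantees that no combination is repeated, and the decoded tuples can then be mapped to the actual core, cell line, shape, coat, and drug labels before computing the model inputs.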

The DT model was selected because of the good performance of its statistical parameters in both the training and validation sets, as shown in Table 1. The probability values p(NANO.PTML_in)_cnj were obtained with the NANO.PTML_in system. The heatmap in Fig. 9 illustrates the findings using a 3-color scale based on probability values: the green zone represents a high probability range, the yellow zone a moderate to low probability range, and the red zone a very low predicted probability. Assays that had never been reported previously or had very low representation in the original dataset, as well as insignificant combinations of NP systems, were depicted in white to prevent overestimation in the results. The columns of this heatmap represent the NP core, cell lines, and NP morphology; the column for cell lines is further divided into cytotoxicity and ecotoxicity. The rows correspond to the NP coats studied, arranged by their McGowan volume_n values. The heatmap also reports the frequency with which each combination appears in the columns and rows of the prediction dataset.
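The color coding described above can be expressed as a simple binning rule, sketched here in Python. The cut-off values (0.7 and 0.3) and the minimum-representation threshold are hypothetical placeholders, since the text does not specify the limits used for Fig. 9; the function name is likewise illustrative.

```python
def heatmap_zone(probability, n_cases, high=0.7, low=0.3, min_cases=1):
    """Assign a heatmap color to one NP-system combination.

    probability -- mean predicted probability for the combination (or None)
    n_cases     -- how often the combination appears in the reference data
    high, low   -- hypothetical cut-offs between green/yellow/red zones
    min_cases   -- hypothetical minimum representation to color the cell
    """
    # Unreported or under-represented combinations are left white.
    if probability is None or n_cases < min_cases:
        return "white"
    if probability >= high:
        return "green"
    if probability >= low:
        return "yellow"
    return "red"


for p, n in [(0.92, 40), (0.55, 12), (0.08, 7), (0.80, 0)]:
    print(p, n, heatmap_zone(p, n))
```

Leaving sparsely represented combinations white, rather than coloring them by their predicted probability, avoids suggesting confidence in regions where the model had little or no training support.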

The prediction was carried out taking into account both cytotoxicity and ecotoxicity. The study of cytotoxicity is crucial because NPs are increasingly employed in medical diagnostics and therapies to enhance our comprehension, detection, and treatment of human diseases. Exposure to NPs in consumer products or their use in emerging biomedical applications, such as drug delivery, biosensors [69], or imaging agents [70], entails direct ingestion or injection into the body [71]. Additionally, the study of ecotoxicity is critical for assessing the impact of NPs on ecosystems, wildlife, and human health [72]. It helps in understanding how NPs interact with the environment, enter food chains, and potentially affect biodiversity.

In this context, the outcomes of the DT model highlighted certain NANO.PTML systems as promising candidates for further investigation. Interestingly, Lycopersicon esculentum proved to be a favorable ecotoxicity cell line, exhibiting high probability values with the majority of coating systems. In contrast, the least favorable cell lines were Danio rerio (embryos), Danio rerio (juvenile), Danio rerio (adults), Oryzias latipes (adults), Ceriodaphnia dubia (neonates), Daphnia pulex (adults), Chlorella sp., and Scenedesmus sp., which yielded medium to low probability values. Another important characteristic is the McGowan volume, which has been widely used in many areas to estimate the physicochemical and biochemical properties of molecules [73, 74]; it was relevant specifically for CTAB, PS, and PEG as coating agents, with PVA as an exception. The combination of elliptical-shaped NPs with PVA as a coating agent in cytotoxicity cell lines appears to be a promising candidate for further synthesis. Another important factor is the type of cell line, with cytotoxicity cell lines obtaining higher probability values. It is important to note that all predictions generated by this method should be approached with caution and require experimental validation. The NANO.PTML-DT method holds potential for accelerating experimental studies and offering cost-effective preliminary results for a vast database of NANO.PTML systems. This methodology presents an effective and robust tool for guiding experimental research, offering an alternative to laborious trial-and-error testing.

General applications of NANO.PTML-DT model

The NANO.PTML model has different types of applications in various stages of N2D3 system development, as shown in Fig. 10. These include the selection of new cores, coats, or drugs. In all these cases, the N2D3 systems can be optimized in terms of drug activity and NP system safety (cytotoxicity and ecotoxicity). The first three applications involve the selection of input variables. In the NP core scanning stage, researchers can select different types, sizes, and shapes. In the NP coats scanning stage, they can select up to 16 coating agents, such as CTAB, PVA, PVP, etc. In the drug scanning stage, they can carry out NDDs synthesis modifications, repurposing, and patent greening. Synthesis modifications refer to the prediction of new N2D3 systems (different coats, cores) for new drug structures with potential NDDs activity [75]. Repurposing refers to the prediction of new NDDs for N2D3 systems from already known drugs with other activities [76]. Patent greening refers to the prediction of new N2D3 systems (different coats, cores) for already known NDDs [77]. In all these cases, the outcomes predicted by the NANO.PTML model can optimize NP safety and/or the biological activity of NDDs. To make these predictions, the values of different input variables have to be changed. In Fig. 10, we highlight the input variables that need to be changed to make predictions for different applications. For details about the variables, see the AI/ML Python Computational Models section. The variables in these four stages can be changed one by one according to the researchers’ needs; however, they can also be changed simultaneously. For example, in the simulation experiment shown in Fig. 9, we created a total of 500,000 assays in which up to 123 drugs, 53 cell lines, 16 NP coats, 5 NP shapes, and 5 NP cores were changed at the same time.

Conclusions

The NANO.PTML model, which integrates NDDs and NP models, offers a practical solution for developing new NP systems as drug carriers for neurodegenerative diseases. It effectively addresses the challenge of exploring numerous combinations of NP and NDDs compounds. The best-performing AI/ML model, using the DT algorithm, achieved high Sp (96.4%/96.2%) and Sn (79.3%/75.7%) in training and validation, with AUROC values of 0.97 and 0.96. Chemically synthesized Fe₃O₄ NPs were structurally characterized and coated with CTAB to enhance water solubility. We illustrated an application of the IFPTML-DT model to a real experiment (reported here). To do this, we performed a simulation experiment using a large prediction dataset of 500,000 cases/empirical experiments similar to the NPs studied in the experimental part. This simulation experiment identified certain NP systems as promising candidates for further investigation, highlighting the Lycopersicon esculentum cell line for ecotoxicity studies according to the green zone of Fig. 9. The McGowan volume was significant for certain coating agents (CTAB, PS, PEG) but not for PVA. Overall, the NANO.PTML model expedites experimental research and provides reliable initial findings, reducing the reliance on time-consuming wet-lab procedures.

Data availability

The datasets generated and/or analyzed during the current study are available in the Figshare repository, DOI: 10.6084/m9.figshare.25450291. The code of the NANO.PTML models was uploaded to a GitHub repository and is freely available for use by researchers at: https://github.com/she012/NANO.PTML-project.

Abbreviations

AI:: Artificial Intelligence refers to the use of computers to simulate intelligent behavior with minimal human involvement [78]
AUROC:: Area Under Receiver Operating Characteristic is commonly used to assess the accuracy of diagnostic tests. A ROC curve that is closer to the upper left corner of the graph indicates higher test accuracy, as this point represents a sensitivity of 1 and a false positive rate of 0 (specificity of 1) [79]
BBB:: Blood-Brain Barrier is a selective permeability barrier formed by the blood vessels that vascularize the central nervous system [80]
CNS:: Central Nervous System is the brain and spinal cord. This system is responsible for receiving, processing, and responding to sensory information [81]
CTAB:: Cetyl Trimethyl Ammonium Bromide is a cationic surfactant
DLS:: Dynamic Light Scattering measures changes in scattering intensity over time caused by particles moving randomly due to Brownian motion [82]
DNDS:: Drug Nanoparticle Delivery System
DRX:: X-Ray Diffraction is a widely used method for examining crystal structures and atomic spacing [83]
DT:: Decision Tree is a hierarchical decision support model that employs a tree-like structure to represent decisions and their potential outcomes [54]
GB:: Gradient Boosting is an ensemble machine learning technique that sequentially combines the predictions of multiple weak learners, usually decision trees [57]
IF:: Information Fusion is the process of integrating data from different sources
kNN:: K-Nearest Neighbors is a popular machine learning technique based on the principle that data points which are close to each other are likely to share similar labels or values [84]
LDA:: Linear Discriminant Analysis is a machine learning method that aims to identify a linear combination of features that effectively distinguishes between two or more classes of objects or events [85]
NDs:: Neurodegenerative Diseases
MA:: Moving Average is a method used to analyze data points by calculating a series of average values from various subsets of the entire dataset [86]
MCC:: Matthews Correlation Coefficient is a metric for binary classification that evaluates predictions by considering true positives, true negatives, false positives, and false negatives [87]
ML:: Machine Learning is the science of developing algorithms and statistical models that enable computer systems to perform tasks without explicit instructions, relying instead on patterns and inferences [88]
NP:: Nanoparticle
NDDs:: Neurodegenerative Disease Drugs
N2D3:: Nanoparticle Neuronal Disease Drug Delivery systems
PMAO:: Poly(Maleic Anhydride-alt-1-Octadecene)
PT:: Perturbation Theory involves starting with a simple system for which a mathematical solution is already known. Then, an additional perturbation is introduced to represent a weak disturbance to the system [89]
PTOs:: Perturbation Theory Operators are linear and non-linear transformations of the moving average, for example, the deviation from the moving average
RF:: Random Forest is a popular machine learning technique that combines the collective predictions of numerous decision trees to generate a combined outcome [90]
RT:: Room Temperature
TEM:: Transmission Electron Microscopy is a method of microscopy where a specimen is illuminated with a beam of electrons, allowing the formation of an image as the electrons pass through the specimen [91]
TGA:: ThermoGravimetric Analysis is a method used to assess the thermal stability of materials, including polymers, by measuring changes in weight as a function of temperature [92]
VSM:: Vibrating-Sample Magnetometer is a device used to measure the magnetic properties of a sample while it undergoes perpendicular vibrations within a uniform magnetic field [93]

References

Asefy Z, Hoseinnejhad S, Ceferov Z. Nanoparticles approaches in neurodegenerative diseases diagnosis and treatment. Neurol Sci. 2021;42:2653–60.
Agnello L, Ciaccio M. Neurodegenerative diseases: from molecular basis to therapy. MDPI; 2022;23:12854.
Fahn S. The 200-year journey of Parkinson disease: reflecting on the past and looking towards the future. Parkinsonism Relat Disord. 2018;46:S1–5.
Knopman DS, Amieva H, Petersen RC, Chételat G, Holtzman DM, Hyman BT, Nixon RA, Jones DT. Alzheimer disease. Nat Reviews Disease Primers. 2021;7:33.
Abramov AY, Berezhnov AV, Fedotova EI, Zinchenko VP, Dolgacheva LP. Interaction of misfolded proteins and mitochondria in neurodegenerative disorders. Biochem Soc Trans. 2017;45:1025–33.
Zakharova M. Modern approaches in gene therapy of motor neuron diseases. Med Res Rev. 2021;41:2634–55.
Myasoedov NF, Lyapina LA, Andreeva LA, Grigorieva ME, Obergan TY, Shubina TA. The modern view on the role of glyprolines by metabolic syndrome. Med Res Rev. 2021;41:2823–40.
Kukharsky MS, Skvortsova VI, Bachurin SO, Buchman VL. In a search for efficient treatment for amyotrophic lateral sclerosis: old drugs for new approaches. Med Res Rev. 2021;41:2804–22.
Gudasheva TA, Povarnina PY, Tarasiuk AV, Seredenin SB. Low-molecular mimetics of nerve growth factor and brain‐derived neurotrophic factor: design and pharmacological properties. Med Res Rev. 2021;41:2746–74.
Dobrovolskaia MA, Shurin M, Shvedova AA. Current understanding of interactions between nanoparticles and the immune system. Toxicol Appl Pharmacol. 2016;299:78–89.
Moghimi SM. Chemical camouflage of nanospheres with a poorly reactive surface: towards development of stealth and target-specific nanocarriers. Biochim et Biophys Acta -Molecular Cell Res. 2002;1590:131–9.
Delche NA, Kheiri R, Nejad BG, Sheikhi M, Razavi MS, Rahimzadegan M, Salmasi Z. Recent progress in the intranasal PLGA-based drug delivery for neurodegenerative diseases treatment. Iran J Basic Med Sci. 2023;26:1107.
Steeland S, Vandenbroucke RE, Libert C. Nanobodies as therapeutics: big opportunities for small antibodies. Drug Discovery Today. 2016;21:1076–113.
Vio V, Jose Marchant M, Araya E, Kogan J. Metal nanoparticles for the treatment and diagnosis of neurodegenerative brain diseases. Curr Pharm Design. 2017;23:1916–26.
Shah R, Eldridge D, Palombo E, Harding I. Lipid nanoparticles: production, characterization and stability. Springer; 2015.
Elzahhar P, Belal AS, Elamrawy F, Helal NA, Nounou MI. Bioconjugation in drug delivery: practical perspectives and future perceptions. Pharm Nanotechnology: Basic Protocols 2019:125–82.
Jones DE, Ghandehari H, Facelli JC. A review of the applications of data mining and machine learning for the prediction of biomedical properties of nanoparticles. Comput Methods Programs Biomed. 2016;132:93–103.
Sayes C, Ivanov I. Comparative study of predictive computational models for nanoparticle-induced cytotoxicity. Risk Analysis: Int J. 2010;30:1723–34.
Makadia HK, Siegel SJ. Poly lactic-co-glycolic acid (PLGA) as biodegradable controlled drug delivery carrier. Polymers. 2011;3:1377–97.
Gajewicz A. What if the number of nanotoxicity data is too small for developing predictive Nano-QSAR models? An alternative read-across based approach for filling data gaps. Nanoscale. 2017;9:8435–48.
Oksel C, Ma CY, Liu JJ, Wilkins T, Wang XZ. (Q) SAR modelling of nanomaterial toxicity: a critical review. Particuology. 2015;21:1–19.
Winkler DA, Mombelli E, Pietroiusti A, Tran L, Worth A, Fadeel B, McCall MJ. Applying quantitative structure–activity relationship approaches to nanotoxicology: current status and future potential. Toxicology. 2013;313:15–23.
Gajewicz A, Rasulev B, Dinadayalane TC, Urbaszek P, Puzyn T, Leszczynska D, Leszczynski J. Advancing risk assessment of engineered nanomaterials: application of computational approaches. Adv Drug Deliv Rev. 2012;64:1663–93.
Tantra R, Oksel C, Puzyn T, Wang J, Robinson KN, Wang XZ, Ma CY, Wilkins T. Nano (Q) SAR: challenges, pitfalls and perspectives. Nanotoxicology. 2015;9:636–42.
Oksel C, Ma CY, Liu JJ, Wilkins T, Wang XZ. Literature review of (Q) SAR modelling of nanomaterial toxicity. Modelling Toxic Nanopart 2017:103–42.
Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Gonzalez-Diaz H. Designing nanoparticle release systems for drug-vitamin cancer co-therapy with multiplicative perturbation-theory machine learning (PTML) models. Nanoscale. 2019;11:21811–23.
Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Gonzalez-Diaz H. Predicting coated-nanoparticle drug release systems with perturbation-theory machine learning (PTML) models. Nanoscale. 2020;12:13471–83.
Dieguez-Santana K, Gonzalez-Diaz H. Towards machine learning discovery of dual antibacterial drug-nanoparticle systems. Nanoscale. 2021;13:17854–70.
Ortega-Tenezaca B, Gonzalez-Diaz H. IFPTML mapping of nanoparticle antibacterial activity vs. pathogen metabolic networks. Nanoscale. 2021;13:1318–30.
Castellanos-Rubio I, Munshi R, Qin Y, Eason DB, Orue I, Insausti M, Pralle A. Multilayered inorganic–organic microdisks as ideal carriers for high magnetothermal actuation: assembling ferrimagnetic nanoparticles devoid of dipolar interactions. Nanoscale. 2018;10:21879–92.
Castellanos-Rubio I, Arriortua O, Iglesias-Rojas D, Barón A, Rodrigo I, Marcano L, Garitaonandia JS, Orue Ia, Fdez-Gubieda ML, Insausti M. A milestone in the chemical synthesis of Fe3O4 nanoparticles: unreported bulklike properties lead to a remarkable magnetic hyperthermia. Chem Mater. 2021;33:8693–704.
Nader K, Castellanos-Rubio I, Orue I, Iglesias-Rojas D, Barón A, de Muro IG, Lezama L, Insausti M. Getting insight into how iron (III) oleate precursors affect the features of magnetite nanoparticles. J Solid State Chem. 2022;316:123619.
Castellanos-Rubio I, Rodrigo I, Olazagoitia-Garmendia A, Arriortua O, Gil de Muro I, Garitaonandia JS, Bilbao JRn, Fdez-Gubieda ML, Plazaola F, Orue Ia. Highly reproducible hyperthermia response in water, agar, and cellular environment by discretely PEGylated magnetite nanoparticles. ACS Appl Mater Interfaces. 2020;12:27917–29.
He S, Abarrategi JS, Bediaga H, Arrasate S, González-Díaz H. On Additive Artificial Intelligence Discovery of Nanoparticle-Neurodegenerative Disease Drug Delivery Systems. Beilstein Archives 2024; 2024:10.
Bento AP, Gaulton A, Hersey A, Bellis LJ, Chambers J, Davies M, Kruger FA, Light Y, Mak L, McGlinchey S, et al. The ChEMBL bioactivity database: an update. Nucleic Acids Res. 2014;42:D1083–1090.
Davies M, Nowotka M, Papadatos G, Dedman N, Gaulton A, Atkinson F, Bellis L, Overington JP. ChEMBL web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015;43:W612–620.
Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–1107.
Concu R, Kleandrova VV, Speck-Planche A, Cordeiro M. Probing the toxicity of nanoparticles: a unified in silico machine learning model based on perturbation theory. Nanotoxicology. 2017;11:891–906.
Luan F, Kleandrova VV, Gonzalez-Diaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MN. Computer-aided nanotoxicology: assessing cytotoxicity of nanoparticles under diverse experimental conditions by using a novel QSTR-perturbation approach. Nanoscale. 2014;6:10623–30.
Kleandrova VV, Luan F, Gonzalez-Diaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MN. Computational ecotoxicology: simultaneous prediction of ecotoxic effects of nanoparticles under different experimental conditions. Environ Int. 2014;73:288–94.
Gaulton A, Hersey A, Nowotka M, Bento AP, Chambers J, Mendez D, Mutowo P, Atkinson F, Bellis LJ, Cibrian-Uhalte E, et al. The ChEMBL database in 2017. Nucleic Acids Res. 2017;45:D945–54.
Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Felix E, Magarinos MP, Mosquera JF, Mutowo P, Nowotka M, et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 2019;47:D930–40.
Moriwaki H, Tian Y-S, Kawashita N, Takagi T. Mordred: a molecular descriptor calculator. J Cheminform. 2018;10:1–14.
Kleandrova VV, Luan F, Gonzalez-Diaz H, Ruso JM, Speck-Planche A, Cordeiro MN. Computational tool for risk assessment of nanomaterials: novel QSTR-perturbation model for simultaneous prediction of ecotoxicity and cytotoxicity of uncoated and coated nanoparticles under multiple experimental conditions. Environ Sci Technol. 2014;48:14686–94.
Luan F, Kleandrova VV, González-Díaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MNDS. Computer-aided nanotoxicology: assessing cytotoxicity of nanoparticles under diverse experimental conditions by using a novel QSTR-perturbation approach. Nanoscale. 2014;6:10623–30.
Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Montemore MM, Gonzalez-Diaz H. PTML Model for Selection of Nanoparticles, anticancer drugs, and vitamins in the design of drug-vitamin nanoparticle Release systems for Cancer Cotherapy. Mol Pharm. 2020;17:2612–27.
Urista DV, Carrue DB, Otero I, Arrasate S, Quevedo-Tumailli VF, Gestal M, Gonzalez-Diaz H, Munteanu CR. Prediction of Antimalarial Drug-decorated nanoparticle Delivery systems with Random Forest models. Biology (Basel) 2020; 9.
Hill T, Lewicki P. Statistics: Methods and Applications. 1st ed. StatSoft, Inc.; 2005.
Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–7.
Huberty CJ, Olejnik S. Applied MANOVA and discriminant analysis. 2nd ed. Hoboken, New Jersey: John Wiley & Sons, Inc.; 2006.
Hanczar B, Hua J, Sima C, Weinstein J, Bittner M, Dougherty ER. Small-sample precision of ROC-related estimates. Bioinformatics. 2010;26:822–30.
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30.
Anowar F, Sadaoui S, Selim B. Conceptual and empirical comparison of dimensionality reduction algorithms (pca, kpca, lda, mds, svd, lle, isomap, Le, Ica, t-sne). Comput Sci Rev. 2021;40:100378.
Kotsiantis SB. Decision trees: a recent overview. Artif Intell Rev. 2013;39:261–83.
Breiman L. Random forests. Mach Learn. 2001;45:5–32.
Peterson LE. K-nearest neighbor. Scholarpedia. 2009;4:1883.
Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat 2001:1189–232.
Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21:1–13.
Feurer M, Springenberg J, Hutter F. Initializing bayesian hyperparameter optimization via meta-learning. In Proceedings of the AAAI Conference on Artificial Intelligence. 2015.
Swain PH, Hauska H. The decision tree classifier: design and potential. IEEE Trans Geoscience Electron. 1977;15:142–7.
Luan F, Cordeiro MN, Alonso N, Garcia-Mera X, Caamano O, Romero-Duran FJ, Yanez M, Gonzalez-Diaz H. TOPS-MODE model of multiplexing neuroprotective effects of drugs and experimental-theoretic study of new 1,3-rasagiline derivatives potentially useful in neurodegenerative diseases. Bioorg Med Chem. 2013;21:1870–9.
Garcia I, Fall Y, Gomez G, Gonzalez-Diaz H. First computational chemistry multi-target model for anti-Alzheimer, anti-parasitic, anti-fungi, and anti-bacterial activity of GSK-3 inhibitors in vitro, in vivo, and in different cellular lines. Mol Divers. 2011;15:561–7.
Ferreira da Costa J, Silva D, Caamaño O, Brea JM, Loza MI, Munteanu CR, Pazos A, García-Mera X, Gonzalez-Diaz H. Perturbation theory/machine learning model of ChEMBL data for dopamine targets: docking, synthesis, and assay of new l-prolyl-l-leucyl-glycinamide peptidomimetics. ACS Chem Neurosci. 2018;9:2572–87.
Kar S, Gajewicz A, Puzyn T, Roy K, Leszczynski J. Periodic table-based descriptors to encode cytotoxicity profile of metal oxide nanoparticles: a mechanistic QSTR approach. Ecotoxicol Environ Saf. 2014;107:162–9.
Jagiello K, Grzonkowska M, Swirog M, Ahmed L, Rasulev B, Avramopoulos A, Papadopoulos MG, Leszczynski J, Puzyn T. Advantages and limitations of classic and 3D QSAR approaches in nano-QSAR studies based on biological activity of fullerene derivatives. J Nanopart Res. 2016;18:1–16.
Mikolajczyk A, Gajewicz A, Mulkiewicz E, Rasulev B, Marchelek M, Diak M, Hirano S, Zaleska-Medynska A, Puzyn T. Nano-QSAR modeling for ecosafe design of heterogeneous TiO 2-based nano-photocatalysts. Environ Science: Nano. 2018;5:1150–60.
Durán FJR, Alonso N, Caamaño O, García-Mera X, Yañez M. Multi-Target Prediction of Neuroprotective Drugs, Synthesis, Assay, and Theoretical Study of Rasagiline Carbamates. 2015.
Romero-Duran FJ, Alonso N, Yanez M, Caamano O, García-Mera X, Gonzalez-Diaz H. Brain-inspired cheminformatics of drug-target brain interactome, synthesis, and assay of TVP1022 derivatives. Neuropharmacology. 2016;103:270–8.
Doria G, Conde J, Veigas B, Giestas L, Almeida C, Assunção M, Rosa J, Baptista PV. Noble metal nanoparticles for biosensing applications. Sensors. 2012;12:1657–87.
Rabin O, Manuel Perez J, Grimm J, Wojtkiewicz G, Weissleder R. An X-ray computed tomography imaging agent based on long-circulating bismuth sulphide nanoparticles. Nat Mater. 2006;5:118–22.
Lewinski N, Colvin V, Drezek R. Cytotoxicity of nanoparticles. Small. 2008;4:26–49.
Handy RD, Owen R, Valsami-Jones E. The ecotoxicology of nanoparticles and nanomaterials: current status, knowledge gaps, challenges, and future needs. Ecotoxicology. 2008;17:315–25.
Abraham MH, Chadha HS, Martins F, Mitchell RC, Bradbury MW, Gratton JA. Hydrogen bonding part 46: a review of the correlation and prediction of transport properties by an LFER method: physicochemical properties, brain penetration and skin permeability. Pest Sci. 1999;55:78–88.
Abraham MH, McGowan J. The use of characteristic volumes to measure cavity terms in reversed phase liquid chromatography. Chromatographia. 1987;23:243–6.
Shamsi J, Urban AS, Imran M, De Trizio L, Manna L. Metal halide perovskite nanocrystals: synthesis, post-synthesis modifications, and their optical properties. Chem Rev. 2019;119:3296–348.
Pushpakom S, Iorio F, Eyers PA, Escott KJ, Hopper S, Wells A, Doig A, Guilliams T, Latimer J, McNamee C. Drug repurposing: progress, challenges and recommendations. Nat Rev Drug Discovery. 2019;18:41–58.
Du K, Li P, Yan Z. Do green technology innovations contribute to carbon dioxide emission reduction? Empirical evidence from patent data. Technological Forecast Social Change. 2019;146:297–303.
Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism. 2017;69:S36–40.
Nahm FS. Receiver operating characteristic curve: overview and practical use for clinicians. Korean J Anesthesiology. 2022;75:25.
Daneman R, Prat A. The blood–brain barrier. Cold Spring Harb Perspect Biol. 2015;7:a020412.
Brodal P. The central nervous system. Oxford University Press; 2010.
Kaszuba M, McKnight D, Connah MT, McNeil-Watson FK, Nobbmann U. Measuring sub nanometre sizes using dynamic light scattering. J Nanopart Res. 2008;10:823–9.
Bunaciu AA, Udriştioiu EG, Aboul-Enein HY. X-ray diffraction: instrumentation and applications. Crit Rev Anal Chem. 2015;45:289–99.
Kramer O. K-nearest neighbors. In: Dimensionality reduction with unsupervised nearest neighbors. Springer; 2013. p. 13–23.
Balakrishnama S, Ganapathiraju A. Linear discriminant analysis-a brief tutorial. Inst Signal Inform Process. 1998;18:1–8.
Hunter JS. The exponentially weighted moving average. J Qual Technol. 1986;18:203–10.
Chicco D, Jurman G. The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification. BioData Min. 2023;16:4.
Zhou Z-H. Machine learning. Springer Nature; 2021.
Stevenson PM. Optimized perturbation theory. Phys Rev D. 1981;23:2916.
Biau G, Scornet E. A random forest guided tour. Test. 2016;25:197–227.
Zuo JM, Spence JC. Advanced transmission electron microscopy. Springer; 2017.
Coats A, Redfern J. Thermogravimetric analysis. A review. Analyst. 1963;88:906–24.
Foner S. Versatile and sensitive vibrating-sample magnetometer. Rev Sci Instrum. 1959;30:548–57.

Acknowledgements

Mrs. Arrate Bañeres is acknowledged for management support.

Funding

We gratefully acknowledge financial support from the Basque Government / Eusko Jaurlaritza (IT1558-22), SPRI ELKARTEK grant AIMOFGIF (KK-2022/00032), the Ministry of Science and Innovation (PID2022-137365NB-I00), and Eusko Jaurlaritza, LANBIDE, INVESTIGO Grants, IKERDATA 2022/IKER/000040, funded by NextGenerationEU funds of the European Commission. We also acknowledge the Spanish Ministry of Science and Innovation for financial support under grant No. PID2022-136993OB-I00 (AEI/FEDER, UE), funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by “ERDF A way of making Europe”, by the “European Union”. This work was also supported in part by the U.S. Department of Energy (DOE) Grant DE-SC0022239. The work used resources of the Center for Computationally-Assisted Science and Technology (CCAST) at North Dakota State University (Fargo, ND, USA), which was made possible in part by the U.S. National Science Foundation (NSF) MRI Award No. 2019077.

Author information

Authors and Affiliations

Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND, 58108, USA
Shan He, Estefania Ascencio, Gerardo M. Casanola-Martin & Bakhtiyor Rasulev
Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, Leioa, 48940, Spain
Shan He, Karam Nader, Julen Segura Abarrategi, Estefania Ascencio, Idoia Castellanos-Rubio, Maite Insausti, Sonia Arrasate & Humberto González-Díaz
IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº 6, Leioa, 48940, Greater Bilbao, Basque Country, Spain
Shan He, Harbil Bediaga & Estefania Ascencio
Faculty of Physical Mathematical Sciences, Autonomous University of Nuevo León, San Nicolás de los Garza, 66455, Nuevo León, México
Deyani Nocedo-Mena
BCMaterials, Basque Center for Materials, Applications and Nanostructures, Leioa, 48940, Spain
Maite Insausti
BIOFISIKA: Basque Center for Biophysics CSIC, University of The Basque Country (UPV/EHU), Barrio Sarriena s/n, Leioa, 48940, Bizkaia, Basque Country, Spain
Humberto González-Díaz
IKERBASQUE, Basque Foundation for Science, Bilbao, 48011, Biscay, Spain
Humberto González-Díaz

Contributions

ICR, MI, SA, and HGD conceived the presented idea. JSA collected the dataset. SH, GMCM, BR, SA, and HGD implemented the idea computationally and performed the computations and analysis. KN, ICR, and MI performed the nanoparticle synthesis experiments. GMCM, ICR, MI, BR, SA, and HGD supervised the findings of this work. All authors discussed the results and contributed to writing the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Idoia Castellanos-Rubio or Sonia Arrasate.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

He, S., Nader, K., Abarrategi, J.S. et al. NANO.PTML model for read-across prediction of nanosystems in neurosciences. computational model and experimental case of study. J Nanobiotechnol 22, 435 (2024). https://doi.org/10.1186/s12951-024-02660-9

Received: 18 April 2024
Accepted: 24 June 2024
Published: 23 July 2024
DOI: https://doi.org/10.1186/s12951-024-02660-9
