NANO.PTML model for read-across prediction of nanosystems in neurosciences. computational model and experimental case of study

Abstract

Neurodegenerative diseases involve progressive neuronal death. Traditional treatments often struggle because of poor solubility, low bioavailability, and difficulty crossing the Blood-Brain Barrier (BBB). Nanoparticles (NPs) are attracting growing attention in the biomedical field as carriers of neurodegenerative disease drugs (NDDs) to the central nervous system. Here, we present a combined computational and experimental analysis. In the computational study, we used the IFPTML technique, which combines Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML), to select the most promising Nanoparticle Neuronal Disease Drug Delivery (N2D3) systems. The application of the IFPTML approach to nanoscience is termed NANO.PTML. The IF process was carried out between 4403 NDD assays and 260 NP cytotoxicity assays, yielding a dataset of 500,000 cases. The best IFPTML model was obtained with the Decision Tree (DT) algorithm, which showed satisfactory performance, with specificity values of 96.4% and 96.2% and sensitivity values of 79.3% and 75.7% in the training (375k/75%) and validation (125k/25%) sets, respectively. Moreover, the DT model obtained Area Under the Receiver Operating Characteristic curve (AUROC) scores of 0.97 and 0.96 in the training and validation series, highlighting its effectiveness in classification tasks. In the experimental part, two samples of NPs (Fe3O4_A and Fe3O4_B) were synthesized by thermal decomposition of an iron(III) oleate (FeOl) precursor and structurally characterized by different methods. Additionally, the amphiphilic molecule CTAB (Cetyl Trimethyl Ammonium Bromide) was employed to make the as-synthesized hydrophobic NPs (Fe3O4_A and Fe3O4_B) soluble in water. Then, to study a wider range of NP system variants, an illustrative simulation experiment was performed with the IFPTML-DT model, for which a prediction dataset of 500,000 cases was created. The outcome of this experiment highlighted certain NANO.PTML systems as promising candidates for further investigation. The NANO.PTML approach has the potential to accelerate experimental investigations and offer initial insights into various NP and NDD compounds, serving as an efficient alternative to time-consuming trial-and-error procedures.

Introduction

Neurodegenerative Diseases (NDs) constitute a diverse set of conditions marked by the gradual deterioration and loss of neurons in various regions of the nervous system. These diseases pose a significant challenge to global health because their incidence is increasing. With the expansion of the aging population, the World Health Organization anticipates a threefold increase worldwide in the number of individuals affected by neurodegenerative disorders over the coming three decades. Although the precise mechanisms driving NDs are not fully elucidated, researchers suggest a multifaceted interplay involving genetic, epigenetic, and environmental factors. Presently, there are no established treatments capable of slowing, halting, or preventing the progression of any NDs [1, 2]. For example, diseases like Alzheimer's and Parkinson's, which have been recognized for over a century, continue to lack a cure [3,4,5]. Some promising lines of research for the treatment of neurodegenerative disorders are gene therapy [6], the development of neuroprotective mimetic peptides [7], and the repurposing (or reevaluation) of known drugs [8], among others [9].

One challenge is the interaction between nanoparticles (NPs) and components of the immune system. Over the past ten years, research has demonstrated that although NPs can be toxic, advances in nanotechnology have enabled the modification of these materials. These modifications can either prevent interaction with the immune system or specifically target it. When nanoparticles are used for medical purposes that do not aim to activate or suppress the immune system, it is beneficial to avoid any immune system interaction [10]. For instance, NPs can be engineered by coating them with poly(ethylene glycol) (PEG) or other polymers, creating a hydrophilic layer that conceals them from the immune system’s detection [11]. Another challenge in the treatment of neurodegenerative disorders lies in the passage of therapeutic agents through the Blood-Brain Barrier (BBB) to reach the Central Nervous System (CNS). To overcome these obstacles, research efforts are directed towards both the development of new drugs and the exploration of innovative drug delivery methods, including targeted nanocarriers [12]. Some of these approaches are nanobodies [13], nano-antibodies, nano-metal particles (gold, silver, iron oxide) [14] and lipid nanoparticles (nanoliposomes) [15]. These nano-approaches, applied to drug R&D as innovative delivery systems for NDs, face inherent challenges. In particular, they entail arduous experimental work associated with high costs, low stability profiles, short shelf lives, and inconsistency between and within production batches [16].

In this sense, Machine Learning (ML) techniques can be useful for analyzing, predicting, and selecting the optimal nano-system for drug delivery in neurodegenerative diseases (Nanoparticle Neuronal Disease Drug Delivery systems, hereafter “N2D3 systems”). ML has been successfully used to predict biomedical properties of NPs of medical interest. These studies include the influence of particle physicochemical properties on cellular uptake, cytotoxicity, molecular loading, and molecular release, as well as manufacturing properties such as NP size and polydispersity [17, 18]. To design new N2D3 systems, an ML algorithm needs to analyze multiple output properties (IC50, Ki, etc.) of a broad range of N2D3 systems with different transported substances (drugs), nanocarriers, coatings, etc., under various conditions such as cell lines and organisms (labels) [19]. On the other hand, Gajewicz et al. [20] have recently discussed the scarcity and/or dispersion (different sources of information) of nanotoxicity data, with special emphasis on the low variety of drugs transported by current N2D3 systems in contrast to the high number of free-drug assays [21,22,23,24,25]. Consequently, to address the N2D3 system design problem, an ML model should be multi-output (able to predict multiple outputs), read-across species (able to infer properties for different species), multi-label (able to consider multiple cell lines, etc.), and able to consider multiple sources of information at the same time. For this purpose, our group introduced the Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML) algorithm. IFPTML takes information from different sources (drug assays, NP assays, proteomics, metabolic networks, etc.) and carries out an IF process, then uses PT operators to quantify the variability of the data, and finally uses ML algorithms to seek a predictive model and predict new N2D3 systems. For applications specific to nanoscience, the algorithm is called NANO.PTML. The NANO.PTML algorithm has previously been applied successfully to different types of NP systems [26,27,28,29].

In this paper, we first use the NANO.PTML algorithm to find a new ML model able to predict new N2D3 systems. Furthermore, to illustrate the applicability of the NANO.PTML model in practice, we report an additional computational-experimental case study. In this case study, we first carried out the synthesis and characterization of two new NPs with potential application in the development of N2D3 systems. Next, we used the NANO.PTML model to simulate the outcomes of 500,000 different assays of N2D3 systems based on the two NPs reported. These predictions involve different combinations of up to 123 drugs, 53 cell lines, 16 NP coats, 5 NP core types, and 5 NP shapes. The outcome of this experiment helps to identify promising N2D3 systems and provides insights into their behavior across different cell lines and coating agents, among other factors, which could offer valuable guidance for future studies. NANO.PTML model predictions and their experimental validation could offer a promising alternative to traditional trial-and-error methods and pave the way for more efficient N2D3 systems for neurodegenerative diseases.

Materials and methods

Experimental methods

Materials

The reagents iron(III) chloride, 1-octadecene, oleic acid, dibenzyl ether, chloroform and Cetyl Trimethyl Ammonium Bromide (CTAB) were purchased from Sigma-Aldrich. Sodium oleate, ethanol, hexane and tetrahydrofuran were purchased from TCI, PanReac, Honeywell and Emplura, respectively.

Experimental characterization

X-ray diffraction (DRX) patterns of the hydrophobic Fe3O4 NPs were collected using a PANalytical X’Pert PRO diffractometer equipped with a copper anode (operated at 40 kV and 40 mA), a diffracted-beam monochromator and a PIXcel detector. Scans were collected in the 10–90° 2θ range with a step size of 0.02° and a scan step time of 1.25 s. The amount of organic matter in the hydrophobic Fe3O4 NPs was determined via thermogravimetric analysis (TGA), performed in a NETZSCH STA 449 C thermogravimetric analyser, by heating 10 mg of dry sample at 10 °C/min in Ar atmosphere. Dynamic Light Scattering (DLS) and Zeta potential (ζ) measurements of the NPs functionalized with CTAB were performed in a Zetasizer Nano-ZS (Malvern Instruments). The measurements were carried out at 25 °C after an equilibration time of 1 min on 0.05 mg·mL−1 Fe3O4@CTAB aqueous dispersions. For each sample, 10 runs of 10 s were performed with three repetitions. A Philips CM200 Transmission Electron Microscope (TEM) with an accelerating voltage of 200 kV and a point resolution of 0.235 nm was used to analyse the morphology of the samples. Magnetization as a function of the magnetic field, M(H), was measured at Room Temperature (RT) in a Vibrating-Sample Magnetometer (VSM) on the dried hydrophobic nanoparticles, and the magnetization value was normalized per unit mass of inorganic matter.

Preparation of Fe3O4 nanoparticles

Two different Fe3O4 nanoparticles (NPs) were synthesized by thermal decomposition of an iron(III) oleate (FeOl) precursor which was previously prepared from iron(III) chloride and sodium oleate mixed in a mixture of solvents (hexane, ethanol and distilled water). The synthesis process of both the FeOl precursor and the Fe3O4 NPs was formerly analyzed and optimized throughout different works [30,31,32].

For this work, two samples composed of NPs of similar average dimension (≈ 20 nm) but different morphology (cuboctahedral and octahedral) were prepared (samples Fe3O4_A and Fe3O4_B, respectively). Sample Fe3O4_B was prepared by mixing 10 mmol of the previously prepared FeOl precursor with 20 mL of 1-octadecene, 10 mL of dibenzyl ether and 6.4 mL of oleic acid and heating the mixture until reflux (around 320 °C). The resulting hydrophobic Fe3O4 NPs coated with oleic acid were washed by centrifugation 3 times (at 9500 rpm) with ethanol and tetrahydrofuran and, finally, collected in chloroform and stored in the fridge at 4 °C. Fe3O4_A was prepared similarly, but in this case the synthesis was scaled up twofold to analyze the effect of this synthetic parameter on the features of the NPs.

In order to make the as-synthesized hydrophobic NPs (Fe3O4_A and Fe3O4_B) soluble in water (Fig. 1a), a coating approach based on a previously refined protocol was carried out [33]. In this case, instead of the poly(maleic anhydride-alt-1-octadecene) (PMAO) polymer, the amphiphilic CTAB (Cetyl Trimethyl Ammonium Bromide) molecule was used for the coating (Fig. 1b). A CTAB solution in chloroform was added to a 1 mg/mL stock solution of NPs (maintaining a ratio of 100 molecules per nm2 of Fe3O4 NP surface). After stirring the mixture for 15 min, the solvent was evaporated under vacuum and the nanoparticles were redispersed in chloroform. This process was repeated three times, and the last redispersion was carried out using distilled H2O. Finally, the two samples functionalized with CTAB (Fe3O4_A@CTAB and Fe3O4_B@CTAB NPs) were further washed with distilled water (3 times) by centrifugation to remove the excess CTAB that was not attached to the surface of the NPs. The scheme of the NP coating process is displayed in Fig. 1.
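
The ratio of 100 CTAB molecules per nm2 of NP surface can be turned into a mass of CTAB per mass of NPs with a short calculation. The sketch below is only an illustrative estimate and not the protocol of ref. [33]: it assumes spherical particles, the bulk density of magnetite (5.17 g/cm3), and that the quoted NP mass refers to the inorganic core only (the oleic acid shell is ignored). The function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23      # molecules per mol
CTAB_MOLAR_MASS = 364.45      # g/mol, C19H42BrN
MAGNETITE_DENSITY = 5.17      # g/cm^3, bulk Fe3O4 (assumed for the NP core)


def ctab_mass_per_mg_np(diameter_nm, molecules_per_nm2=100.0):
    """Return mg of CTAB per mg of Fe3O4 core, assuming spherical NPs."""
    density_g_per_nm3 = MAGNETITE_DENSITY * 1e-21    # 1 cm^3 = 1e21 nm^3
    volume_nm3 = math.pi * diameter_nm ** 3 / 6.0     # sphere volume
    surface_nm2 = math.pi * diameter_nm ** 2          # sphere surface
    particles_per_mg = 1e-3 / (density_g_per_nm3 * volume_nm3)
    molecules = molecules_per_nm2 * surface_nm2 * particles_per_mg
    return molecules / AVOGADRO * CTAB_MOLAR_MASS * 1e3   # g -> mg


ctab_mg = ctab_mass_per_mg_np(20.0)
print(f"{ctab_mg:.2f} mg CTAB per mg Fe3O4 core")
```

Under these assumptions the required CTAB mass scales with the surface-to-volume ratio (6/d for a sphere), so doubling the particle diameter halves the CTAB needed per unit mass of NPs.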

Fig. 1

(a) Hydrophobic Fe3O4 NPs coated with oleic acid, and (b) hydrophilic Fe3O4 NPs coated with oleic acid and a layer of CTAB

Computational methods

In a previous work, we collected three datasets from different databases. The first dataset (Dataset 01), obtained from ChEMBL and containing information from preclinical trials of different NDDs, was merged with Dataset 02, built from NP data collected from the literature. As a result, three large subsets (Subset 1, Subset 2, Subset 3) with different variables were obtained, from which the best IFPTML model for effective N2D3 systems was derived [34]. In this work, we reprocessed all the information with Python algorithms in order to provide, at the same time, open-access code for this problem. To construct the IFPTML models, we followed the sequential steps outlined in Fig. 2, which illustrates the overall workflow of the computational procedures employed in this study. To facilitate comprehension, each step is annotated with a corresponding number (e.g., 2.2.1, 2.2.2).

Fig. 2

IFPTML detailed information-processing workflow. Step 2.2.1 and 2.2.2 Data collection. Step 2.2.3 Data pre-processing and Information Fusion (NP and NDDs assay). Step 2.2.4 Objective and reference functions definition. Step 2.2.5 PTO calculation

NP cytotoxicity dataset

In parallel, the dataset of preclinical assays for cytotoxicity/ecotoxicity of NPs was collected from 62 papers (step 2.2.2 in Fig. 2). This dataset contained 260 preclinical assays for 31 NPs, an average of approximately 8.39 assays per NP. Furthermore, the dataset covered a wide range of NP properties, including morphology, physicochemical properties, coating agents, assay duration, and measurement conditions. These properties were represented as discrete variables (cnj) used to characterize the conditions and labels of each assay. We grouped all specific conditions of each assay into a general vector cnj = [cn1, cn2, cn3, …, cnmax]. These variables were biological activity parameters (cn0), cell lines utilized in assays (cn1), NP shapes (cn2), measurement conditions (cn3), and coating agents (cn4). Please see more details about the dataset content in the Supporting Information SI00.docx, 1.1.1. NP cytotoxicity dataset.

NDDs dataset from ChEMBL

First, 4403 preclinical assays of Neurodegenerative Disease Drugs (NDDs) were downloaded from the ChEMBL database (step 2.2.1 in Fig. 2) [35,36,37]. The dataset comprised 2566 different NDDs, with an average of approximately 1.72 assays per drug. Additionally, we defined as categorical variables (cdj) the conditions covering biological activity parameters (cd0), target proteins associated with NDDs (cd1), cell lines used in NDD assays (cd2), and organisms involved (cd3). The nature and quality of the data were also defined as categorical variables, including type of target (cd4), type of assay (cd5), data curation (cd6), confidence score (cd7), and target mapping (cd8). Additionally, the database provided molecular descriptors (Ddk = [Dd1, Dd2]) to characterize the chemical structure of the NDD compounds. Specifically, two types of molecular descriptors were used for each compound: the logarithm of the n-Octanol/Water Partition coefficient (LOGPi) and the Topological Polar Surface Area (PSAi). Please see more details about the dataset content in the Supporting Information SI00.docx, 1.1.2. NDDs dataset from ChEMBL.

IF process drug nanoparticle delivery system (DNDS) pair resampling

Initially, we used the objective value vij to formulate the IFPTML model. The model involves two types of observed values, denoted vij(cd0) and vnj(cn0), corresponding to NDDs and NPs, respectively. Additionally, we established the target function by employing the descriptor vectors Ddk (for the drugs) and Dnk (for the NPs) as input variables in the AI/ML model. In order to simulate a real experiment with the N2D3 systems, we prioritized certain properties while minimizing others. To do this, we defined the desirability values d(cd0) and d(cn0): d(cd0) = 1 or d(cn0) = 1 when the value of vij(cd0) or vnj(cn0) needs to be maximized, and d(cd0) = -1 or d(cn0) = -1 otherwise. We then used a cutoff to rescale vij(cd0) and vnj(cn0) into the binary observed functions f(vij(cd0))obs and f(vnj(cn0))obs. These values were obtained as follows: f(vij(cd0))obs = 1 if vij(cd0) > cutoff and d(cd0) = 1, or if vij(cd0) < cutoff and d(cd0) = -1; f(vij(cd0))obs = 0 otherwise. The NP function f(vnj(cn0))obs is defined analogously. Please see more details in the Supporting Information SI00.docx, 1.1.3. IF process DNDS pair resampling.
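
The cutoff-and-desirability rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in this work: the property names, values, and cutoffs are made up, and the helper `observed_class` is hypothetical.

```python
import pandas as pd


def observed_class(value, cutoff, desirability):
    """Return 1 when the value lies on the desirable side of the cutoff, else 0."""
    if desirability == 1:       # property to be maximized
        return int(value > cutoff)
    if desirability == -1:      # property to be minimized
        return int(value < cutoff)
    raise ValueError("desirability must be 1 or -1")


# Hypothetical assays: property names, values, and cutoffs are illustrative only.
assays = pd.DataFrame({
    "property": ["IC50 (nM)", "IC50 (nM)", "Inhibition (%)", "Inhibition (%)"],
    "value": [50.0, 5000.0, 80.0, 10.0],
    "cutoff": [100.0, 100.0, 50.0, 50.0],
    "desirability": [-1, -1, 1, 1],
})
assays["f_obs"] = [
    observed_class(v, c, d)
    for v, c, d in zip(assays["value"], assays["cutoff"], assays["desirability"])
]
print(assays)
```

In this toy table a low IC50 and a high inhibition percentage are both mapped to class 1, which is how a single binary target can cover properties with opposite desirable directions.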

Definition of objectives and reference functions

Another input variable of the IFPTML model is the reference/objective function, defined as f(vij(cd0), vnj(cn0))ref. This function defines the expected probability f(vij(cd0), vnj(cn0))ref = p(f(vij(cd0), vnj(cn0))ref = 1) of obtaining the desired activity for a particular property. It is calculated from the number of positive outcomes, n(f(vij(cd0)) = 1) for drugs and n(f(vnj(cn0)) = 1) for NPs, divided by the total number of cases for the NDDs and NP systems individually. These functions are expressed as: f(vij(cd0))ref = p(f(vij(cd0))ref = 1) = n(f(vij(cd0))ref = 1)/n(cd0)j and f(vnj(cn0))ref = p(f(vnj(cn0))ref = 1) = n(f(vnj(cn0))ref = 1)/n(cn0)j. Please see more details in Fig. 3 and in the Supporting Information SI00.docx, 1.1.4. Definition of objectives and reference functions.
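
As an illustration of the reference function, the fraction of positive cases per property can be computed with a grouped mean. The sketch below assumes a table with one row per drug assay, a property label `cd0`, and a binary observed function `f_obs`; the column names and the data are hypothetical.

```python
import pandas as pd

# Hypothetical drug assays with their binary observed function f_obs.
drug_assays = pd.DataFrame({
    "cd0": ["IC50 (nM)"] * 4 + ["Ki (nM)"] * 2,
    "f_obs": [1, 0, 0, 1, 1, 1],
})

# Reference function: positive cases divided by all cases sharing the property cd0.
f_ref = drug_assays.groupby("cd0")["f_obs"].mean()

# Attach the prior probability to every assay so it can be used as a model input.
drug_assays["f_ref"] = drug_assays["cd0"].map(f_ref)
print(f_ref)
```

The same grouped mean, applied to the NP table and the label cn0, would give the NP counterpart of the reference function.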

Fig. 3

Workflow for the observed and reference function definition (see details in supporting information)

PTO calculation

IFPTML N2D3 systems data analysis phases

The dataset under study was formed by structural descriptor vectors, denoted Dnk and Ddk, for the NPs [38,39,40] and NDDs [35, 41,42,43], respectively. Furthermore, we defined the assay condition vectors cnj and cdj to denote the labels of NPs and NDDs, respectively. For more detailed information about the structural descriptors and assay condition vectors, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (IFPTML N2D3 systems data analysis phases).

Preprocessing of PT data

The IFPTML study incorporates all vectors cdj and cnj, representing the non-numerical experimental conditions and labels for both NDD and NP preclinical assays. Subsequently, we calculated the Perturbation Theory Operators (PTOs), taking into account the Moving Average (MA) of the NDD and NP descriptors (see Eq. 1 and Eq. 2). PT starts with the experimental/observed value of an already known activity and adds the perturbations/variations to the system [26, 27, 44,45,46,47]. For more detailed information, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (Preprocessing of PT data).

$$\Delta {\rm{D}}\left( {{{\rm{D}}_{{\rm{dk}}}}} \right) = {{\rm{D}}_{{\rm{dk}}}} - \left\langle {{{\left( {{{\rm{D}}_{{\rm{dk}}}}} \right)}_{{{\bf{c}}_{{\rm{dj}}}}}}} \right\rangle$$
(1)
$$\Delta {\rm{D}}\left( {{{\rm{D}}_{{\rm{nk}}}}} \right) = {{\rm{D}}_{{\rm{nk}}}} - \left\langle {{{\left( {{{\rm{D}}_{{\rm{nk}}}}} \right)}_{{{\bf{c}}_{{\rm{nj}}}}}}} \right\rangle$$
(2)
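
Eq. 1 and Eq. 2 amount to subtracting, from each descriptor, its average over all assays that share the same condition. A minimal sketch for the NP case (Eq. 2) is given below; it assumes a single descriptor and a single condition label, and the data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical NP assays: one descriptor (Lnp, NP length in nm) and one condition label.
np_assays = pd.DataFrame({
    "cell_line": ["A549", "A549", "HepG2", "HepG2"],
    "Lnp": [10.0, 30.0, 20.0, 60.0],
})

# Moving average <D_nk>_cnj: mean of the descriptor over assays sharing the condition.
condition_mean = np_assays.groupby("cell_line")["Lnp"].transform("mean")

# PTO of Eq. 2: deviation of each descriptor from its condition-specific mean.
np_assays["dLnp"] = np_assays["Lnp"] - condition_mean
print(np_assays)
```

The drug operators of Eq. 1 follow the same pattern with the descriptors Ddk and the conditions cdj.
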
NANO.PTML models training and validation overview

In developing the model using ML techniques, each case is assigned to either the training (subset = t) or validation (subset = v) series. The assignment of cases should be random, representative, and stratified [48, 49]. Accordingly, we assigned three-quarters of the cases of the whole dataset to subset = t (training) and one-quarter to subset = v (validation), so that the 75%/25% proportion between training and validation was kept [48]. Additionally, the performance of the NANO.PTML models was evaluated using different statistical metrics, particularly Sensitivity (Sn) and Specificity (Sp) [50, 51]. For more detailed information, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (NANO.PTML models training and validation overview).
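
A random, stratified 75%/25% split and the Sn/Sp metrics can be sketched with scikit-learn as follows. The data are synthetic and the helper `sensitivity_specificity` is a hypothetical name introduced for illustration; this is not the script used in this work.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))             # synthetic descriptors
y = (rng.random(1000) < 0.2).astype(int)   # synthetic, imbalanced classes

# Random, stratified split: 75% training (subset = t), 25% validation (subset = v).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def sensitivity_specificity(y_true, y_pred):
    """Sn = TP / (TP + FN); Sp = TN / (TN + FP)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tp / (tp + fn), tn / (tn + fp)


sn, sp = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
print(len(X_train), len(X_val), round(sn, 3), round(sp, 3))
```

Stratifying on the class label keeps the proportion of positive cases nearly identical in both series, which matters when class 1 is the minority.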

NANO.PTML simulation of experimental case of study

We conducted a computational analysis to illustrate the applicability of the NANO.PTML model in an example of a real wet-laboratory setting. In this context, we ran predictions for the Fe3O4-core NPs with CTAB as the coating system, as reported in the experimental part of this work. To create a more ambitious prediction experiment, we added multiple combinations of Fe-based cores, coatings, cell lines, and shapes. In particular, this prediction dataset was formed by diverse combinations of up to 123 drugs, 53 cell lines, 16 coats, 5 NP cores and 5 NP shapes. The NP cores studied were CoFe2O4, ZnFe2O4, Fe3O4, Fe2O3 and Fe. The cell lines used in the cytotoxicity predictive study were L929 (M), HepG2 (H), and A549 (H), among others, while the organisms used in the ecotoxicity computational study included Vibrio fischeri and Oryzias latipes (embryos). The NP shapes considered included irregular and elliptical, among others. Finally, the NP coating agents studied in this research were Polyvinyl alcohol (PVA), Polyvinylpyrrolidone (PVP), CTAB, potato starch (PS), PEG-Si(OMe)3 (PEG), etc. For more detailed information about the simulation experiment, refer to the Supporting Information SI00.docx, 1.1.5. PTO calculation (NANO.PTML simulation of experimental case of study).
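
A prediction set of this kind can be assembled as a grid of drug, cell line, coat, core, and shape combinations. The sketch below uses short illustrative lists (the drug names are placeholders) and is only meant to show the enumeration step, not the procedure actually used to build the 500,000 cases.

```python
from itertools import product

import pandas as pd

# Short illustrative lists; the study reports up to 123 drugs, 53 cell lines,
# 16 coats, 5 cores and 5 shapes. The drug names below are placeholders.
drugs = ["drug_1", "drug_2"]
cell_lines = ["L929 (M)", "HepG2 (H)", "A549 (H)"]
coats = ["PVA", "PVP", "CTAB"]
cores = ["CoFe2O4", "ZnFe2O4", "Fe3O4", "Fe2O3", "Fe"]
shapes = ["irregular", "elliptical"]

grid = pd.DataFrame(
    list(product(drugs, cell_lines, coats, cores, shapes)),
    columns=["drug", "cell_line", "coat", "core", "shape"],
)
print(len(grid))  # 2 * 3 * 3 * 5 * 2 = 180

# Upper bound on distinct combinations for the counts quoted above.
max_combinations = 123 * 53 * 16 * 5 * 5
print(max_combinations)
```

If each case is taken to be one such combination, the full grid for the quoted counts holds about 2.6 million entries, so a set of 500,000 cases would cover a subset of it, which is consistent with the "up to" wording.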

Results and discussion

AI/ML python computational models

In order to design AI/ML models for predicting NP systems as neurodegenerative drug carriers, the Scikit-Learn module in Python [52] was used to identify the best AI/ML estimator. In this context, linear and non-linear classifiers were employed, specifically Linear Discriminant Analysis (LDA) [53], Decision Tree (DT) [54], Random Forest (RF) [55], k-Nearest Neighbor (kNN) [56], and Gradient Boosting (GB) [57]. Additionally, the Expert-Guided Selection (EGS) [34] approach was employed to identify the most significant variables capable of defining the NANO.PTML system. The variables considered crucial for describing the NANO.PTML system were: for the neurodegenerative drug, ΔDPSA(cI)dj (deviation of topological Polar Surface Area); and for the NPs as drug carriers, ΔDt(cIII)nj (deviation of NP safety time), ΔDLnp(cIII)nj (deviation of NP length), ΔDVnpu(cIII)nj (deviation of NP core volume), ΔDVxcoat(cIII)nj (deviation of McGowan volume), and ΔDVvdwMGcoat(cIII)nj (deviation of van der Waals volume from McGowan volume). Table 1 and Fig. 4 present the statistical parameters obtained by the linear and non-linear models. The results showed that the DT classifier exhibited a good fit in both the training and validation sets, with Specificity (Sp) values of 96.4%/96.2% and Sensitivity (Sn) values of 79.3%/75.7%, respectively. Another important statistical parameter is the Matthews Correlation Coefficient (MCC) [58], with values of 0.6722/0.6401 in the training/validation series.
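
The comparison of the five classifier families can be sketched with scikit-learn as shown below. The data are a synthetic stand-in generated with `make_classification`, and all settings other than the classifier types are assumptions chosen for brevity, so the printed scores bear no relation to the values in Table 1.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the PTO descriptor matrix (six numeric input columns).
X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           n_redundant=0, weights=[0.8, 0.2], random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "DT": DecisionTreeClassifier(max_depth=15, random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "kNN": KNeighborsClassifier(),
    "GB": GradientBoostingClassifier(random_state=0),
}

# Fit each estimator on the training series and score it on the validation series.
mcc = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    mcc[name] = matthews_corrcoef(y_va, model.predict(X_va))
    print(f"{name}: MCC(validation) = {mcc[name]:.3f}")
```

MCC is a convenient single number for imbalanced data because it uses all four cells of the confusion matrix, whereas Sn and Sp each look at one class only.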

Table 1 Statistical parameters used for NANO.PTML models

Fig. 4

Summary of the statistical parameters obtained for the linear and non-linear NANO.PTML models. (A) Training set and (B) Validation set

The hyperparameters of the DT algorithm play a crucial role in determining its performance and behavior [59], so they were tuned. The best combination found was the following. The ccp_alpha parameter, set to 0.0, controls cost-complexity pruning, which reduces excessive branching to prevent overfitting; a value of 0.0 means that no pruning was applied. The class_weight parameter assigns weights to the classes of the dataset; in this case we set class 0 to 40% and class 1 to 60%, addressing potential imbalances in class distribution. The choice of “gini” as criterion indicates the use of Gini impurity as the measure of split quality, influencing how the tree partitions the feature space. Furthermore, max_depth was set to 15, limiting the depth of the tree to prevent it from growing overly complex and overfitting the training data. The max_features and max_leaf_nodes parameters were both set to None, which allows the tree to consider all available features at each split and places no limit on the number of leaf nodes. The min_impurity_decrease parameter, set to 0.0, defines the minimum impurity decrease required for a split, regulating the growth of the tree. The min_samples_leaf and min_samples_split parameters were set to 5 and 2, respectively; they establish the minimum number of samples required in a leaf node and to split a node, helping the tree generalize and preventing it from being overly specific to the training data. The min_weight_fraction_leaf parameter was set to 0.0, indicating that it was not applied, while random_state was set to 42, ensuring reproducibility of results across different runs of the model. Finally, setting splitter to “best” indicates that the best split at each node is chosen according to the selected criterion. Further information about these parameters can be found in the documentation of the Scikit-learn library [52]. The hyperparameters used for LDA, kNN, etc. can be found in Table S1 of the Supporting Information SI00.docx.
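
For reference, the hyperparameter combination listed above maps onto the scikit-learn `DecisionTreeClassifier` constructor as sketched below. The class weights are written here as the fractions 0.4 and 0.6, which is an assumption about how the 40%/60% weighting was encoded.

```python
from sklearn.tree import DecisionTreeClassifier

# Decision tree configured with the hyperparameter values reported in the text.
dt = DecisionTreeClassifier(
    ccp_alpha=0.0,                      # no cost-complexity pruning
    class_weight={0: 0.4, 1: 0.6},      # assumed encoding of the 40%/60% weights
    criterion="gini",
    max_depth=15,
    max_features=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    min_samples_leaf=5,
    min_samples_split=2,
    min_weight_fraction_leaf=0.0,
    random_state=42,
    splitter="best",
)
params = dt.get_params()
print(params["max_depth"], params["class_weight"])
```

Note that with min_samples_leaf = 5 a node needs at least 10 samples before it can be split into two valid leaves, so min_samples_split = 2 is not the binding constraint in this configuration.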

Figure 5 depicts the structure of the decision tree, comprising 3249 nodes with a depth of 15 layers and terminating in 1625 leaf nodes, where the final predictions or decisions are made based on the input data [60]. To facilitate understanding of this tree plot, we have focused the explanation on a tree depth of 2 layers, resulting in 4 leaf nodes, which collectively form 7 main families. This analysis involved input variables such as ΔDVvdwMGcoat(cIII)nj, f(vij(cd0),vnj(cn0))ref, and ΔDLnp(cIII)nj. A full description of each family can be found in Table 2. For example, family i is composed of NPs with lower McGowan volume deviation than families v-vii and lower prior probability of activity than families ii-iv.
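
The node, leaf, and depth counts of a fitted tree can be read directly from scikit-learn, as in the sketch below. The tree here is fitted on synthetic data, so its counts differ from those of the model in Fig. 5; only the relation between nodes and leaves carries over.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data standing in for the real descriptor matrix.
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)
tree = DecisionTreeClassifier(max_depth=15, min_samples_leaf=5,
                              random_state=42).fit(X, y)

n_nodes = tree.tree_.node_count
n_leaves = tree.get_n_leaves()
depth = tree.get_depth()
print(n_nodes, n_leaves, depth)

# Each internal node of a binary tree has exactly two children,
# so the total node count equals 2 * leaves - 1.
print(n_nodes == 2 * n_leaves - 1)
print(3249 == 2 * 1625 - 1)   # counts reported for the DT of this work

# Truncated text view of the top of the fitted tree.
print(export_text(tree, max_depth=2))
```

The reported 3249 nodes and 1625 leaves satisfy this relation, which is a quick consistency check on the tree description.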

Fig. 5

Plot for DT structure

Table 2 Description of the 7 main families within the DT structure. The color of each family consistently matches that depicted in Fig. 5

Overall, the profile of family i implies smaller NPs, possibly with lower polarizability, and lower expected biological property values, suggesting an overall reduced likelihood of drug-NP activity. Only 0.4% of cases are predicted as class 1. Consequently, NPs in this family should not be short-listed for assay according to the DT model. On the right section of the DT, family ii is composed of NPs with higher McGowan volume deviation than family i and lower prior probability of activity than families iii and iv. In general, this indicates larger NPs, possibly with higher polarizability, and lower expected biological property values for drug and NP; the likelihood of activity is higher than in family i, although only 1.5% of cases are predicted as class 1. Therefore, NPs in this family should also not be short-listed for assay according to the model. Families iii and iv yielded more promising results, with 4% and 3.3% of class 1, respectively. Family iii suggests smaller NPs, possibly with lower polarizability and low to medium expected biological property values, indicating an overall reduced likelihood of drug-NP activity. Conversely, family iv suggests larger NPs with higher polarizability and medium to high biological property values, indicating a higher likelihood of drug-NP activity.

Another statistical metric used in this study is the Area Under the Receiver Operating Characteristic curve (AUROC), computed for both the training and validation sets (see Fig. 6) [48]. A higher AUROC value indicates a better overall ability of the model to correctly classify instances from both classes. An AUROC of 1.0 represents a perfect classifier, while an AUROC of 0.5 indicates a classifier that performs no better than random guessing [48]. The highest AUROC values, 0.97 and 0.96, were obtained by the DT algorithm, in agreement with the Sn/Sp results in the training/validation sets. In contrast, the LDA algorithm is not among the top-performing classifiers, with AUROC values of 0.73 to 0.74.
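
The two reference points of the AUROC scale can be reproduced with a toy example using `roc_auc_score` from scikit-learn. The labels and scores below are made up for illustration.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]

# Every positive case is scored above every negative case.
auc_perfect = roc_auc_score(y_true, [0.1, 0.2, 0.8, 0.9])
# One positive case is scored below one negative case.
auc_partial = roc_auc_score(y_true, [0.1, 0.4, 0.35, 0.8])
# Identical scores carry no ranking information.
auc_constant = roc_auc_score(y_true, [0.5, 0.5, 0.5, 0.5])

print(auc_perfect, auc_partial, auc_constant)
```

AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, which is why uninformative scores sit at 0.5.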

Fig. 6

AUROC exploration of NANO.PTML models in both training (A) and validation (B) set

Contrast with earlier AI/ML algorithms

Recent research works have addressed a wide variety of problems related to NP and/or NDD discovery (see Table 3). Most of these studies explore the cytotoxicity of NPs or the activity of NDDs against a large number of species by applying NANO.PTML models. Nevertheless, to the best of our knowledge, no study includes both NP and NDD components simultaneously or considers the opportunity of developing N2D3 systems. For example, Kleandrova et al. developed a combined QSTR-perturbation model to simultaneously explore ecotoxicity and cytotoxicity of NPs under different experimental conditions, including diverse measures of toxicities, multiple biological targets, compositions, sizes and conditions to measure those sizes, shapes, times during which the biological targets were exposed to NPs, and coating agents [44]. The model was obtained from 36,488 cases of NP-NP pairs. However, the work of Kleandrova et al. is restricted to the study of ecotoxicity and cytotoxicity of NPs and does not consider data about NDD components. Similarly, Cordeiro et al. built a QSAR-perturbation model involving 5520 cases (NP–NP pairs). The aim of this model is the simultaneous prediction of the ecotoxicity of NPs against several assay organisms (bio-indicators), considering also multiple measures of ecotoxicity, as well as the chemical compositions, sizes, conditions under which the sizes were measured, shapes, and the time during which the diverse assay organisms were exposed to nanoparticles [40]. Like the previous model, it does not take into account the biological activity of NDDs. On the other hand, Luan et al. generated the mx-QSAR model from 4915 cases of multiple assays of neurotoxicity/neuroprotective effects of drugs. The model was trained with a dataset involving diverse assay endpoints of 2217 compounds. Each compound was assayed in at least one out of 338 assays, which included 148 molecular or cellular targets and 35 standard type measures in 11 model organisms (including human). Unlike the previous models, this mx-QSAR algorithm contained information about NDDs; however, it does not consider the NP as part of the system [61]. In this paper, we developed an innovative system including both NP and NDD components. The results of the NANO.PTML-DT model were quite satisfactory, with Sp values of 96.4%/96.2% and Sn values of 79.3%/75.7% in the training and validation series, which included 375 K and 125 K cases, respectively. Among other works with a scope similar to the present one, García et al. built an LDA linear model to predict the results of 42 different experimental tests for GSK-3 inhibitors with heterogeneous structural patterns. GSK-3β inhibitors are interesting candidates for developing compounds against Alzheimer's disease, among other diseases with urgent needs. These authors obtained Sn/Sp ≈ 90% in training/validation series [62]. On the other hand, Ferreira da Costa et al. constructed an LDA model to predict the properties of a query compound or molecular system in experimental assays with multiple boundary conditions involved in the dopamine pathway. They obtained Sn/Sp ≈ 70–91% in both training and validation series [63]. However, it is worth mentioning that the comparison of statistical parameters between the model of this work and the previous ones is of limited informative value, because the design of each model is specific to the problem to be dealt with.

Table 3 NDDs and NP cytotoxicity studies using AI/ML algorithms in previous research works

Experimental study of new system

Characterization of Fe3O4 nanoparticles

Initially, the hydrophobic NPs (samples Fe3O4_A and Fe3O4_B) were characterized structurally, morphologically, and magnetically (Table 4). Both samples present the inverse spinel structure of magnetite (Fe3O4, S.G. Fd-3m) with no traces of secondary phases. The crystallite sizes of the samples were calculated from the main diffraction peak (311) of the X-ray powder diffraction patterns using Scherrer’s equation. The calculated crystallite sizes of the two samples are around 24 nm and are compatible with the average physical size determined by TEM analysis (see Table 4; Fig. 7). The rather good agreement between the two techniques (DRX and TEM) indicates that the NPs of both samples are composed of single nanocrystals. Regarding the morphology of the NPs, sample Fe3O4_A is composed of NPs with more facets (cuboctahedrons), while the NPs of sample Fe3O4_B present an octahedral-like shape, as can be seen in Fig. 7a and b, respectively.
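For readers unfamiliar with the Scherrer calculation mentioned above, the following is a minimal sketch of the standard form D = K·λ / (β·cos θ). The shape factor K = 0.9, the Cu Kα wavelength (0.15406 nm), the peak position, and the peak width are all illustrative assumptions; the radiation source and the measured peak widths are not stated in this section, and instrumental broadening is ignored here.

```python
import math


def scherrer_size_nm(fwhm_deg: float, two_theta_deg: float,
                     wavelength_nm: float = 0.15406, k: float = 0.9) -> float:
    """Crystallite size D = K * lambda / (beta * cos(theta)).

    fwhm_deg      -- peak full width at half maximum, in degrees of 2-theta
    two_theta_deg -- peak position, in degrees of 2-theta
    wavelength_nm -- X-ray wavelength (assumed Cu K-alpha by default)
    k             -- dimensionless shape factor (0.9 is a common choice)
    """
    beta = math.radians(fwhm_deg)             # peak width in radians
    theta = math.radians(two_theta_deg / 2)   # Bragg angle in radians
    return k * wavelength_nm / (beta * math.cos(theta))


# Hypothetical inputs for a (311)-type reflection; not measured values.
d_nm = scherrer_size_nm(fwhm_deg=0.35, two_theta_deg=35.5)
print(f"Estimated crystallite size: {d_nm:.1f} nm")
```

With these assumed inputs the estimate comes out close to 24 nm, which is only meant to show the order of magnitude of peak width that corresponds to crystallites of the size reported in Table 4.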

Table 4 Summary of the features of the two Fe3O4 NP samples: Weight% of the organic matter (O.M.%) in the as-synthesized hydrophobic NPs, size of crystalline domain (DDXR) by Scherrer calculation from the main (311) diffraction peak, the average dimension of the inorganic core obtained by TEM (DTEM), saturation magnetization (MS) of the inorganic core at RT and the hydrodynamic size (DH) and Z potential (Z) of the hydrophilic NPs coated with CTAB
Fig. 7

TEM micrographs of (a) Fe3O4_A (cuboctahedral) and (b) Fe3O4_B (octahedral) nanoparticles. Large scale bars: 100 nm. Zoom scale bars: 10 nm, (c) a representative Electron Diffraction (ED) pattern corresponding to sample Fe3O4_A and (d) M (H) curves of both samples at room temperature

The dependence of the magnetization on the magnetic field, M(H), was measured for the two samples by DC magnetometry at RT. The M(H) curves in Fig. 7d display saturation magnetizations (MS) of 88 and 91 Am2/kgFe3O4, respectively, which indicates the high quality of the magnetite phase and the purity of the inorganic core. After coating the hydrophobic NPs with CTAB, both samples (Fe3O4_A@CTAB and Fe3O4_B@CTAB NPs) become highly soluble in water, as shown by the Z potential values, which are positive owing to the cationic nature of the CTAB molecule (see Table 4; Fig. 7b). Regarding the degree of agglomeration of the NPs in water dispersion, the moderate hydrodynamic diameters (see Table 4), compared with the average diameter of a single NP determined by DRX and TEM, suggest that these NPs are arranged in small clusters (2–5 NPs).

This experimental section focuses specifically on the Fe3O4 NP core with two shapes (cuboctahedral and octahedral) and on the CTAB coating. We performed a computational analysis to demonstrate the practical application of the NANO.PTML model in a real-world wet-laboratory scenario. Additionally, we carried out a simulation experiment that tries to mimic this experimental part. For this purpose, we created a prediction dataset with various combinations of NP systems, including NP cores, coating agents, cell lines, shapes, and anti-neurodegenerative drugs linked with certain coatings. It is important to note that the total number of combinations, considering NP cores, cell lines, shapes, coating agents, and anti-neurodegenerative drugs, amounts to Ntot = n(NP cores) · n(cell lines) · n(NP shapes) · n(NP coats) · n(drugs) = 5 · 53 · 5 · 16 · 123 = 2,607,600 assays. Performing all these combinations in a wet laboratory is impractical, time-consuming, and resource-intensive. Even with expert criteria, the number of assays remains unmanageable. Therefore, the NANO.PTML-DT approach is introduced to address this issue by reducing the number of assays and serving as a guide for the experimental part, highlighting the most promising combinations within the NP systems as drug carriers for neurodegenerative diseases.
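The size of this combinatorial space can be checked with a few lines of Python. The sketch below reproduces the product above and shows one possible way to draw a 500,000-case subset from it. The uniform random draw and the index-decoding helper are assumptions made for illustration only; this section does not specify how the cases of the prediction dataset were selected.

```python
import math
import random

# Number of levels per factor, as given in the text.
levels = {
    "NP cores": 5,
    "cell lines": 53,
    "NP shapes": 5,
    "NP coats": 16,
    "drugs": 123,
}

n_tot = math.prod(levels.values())
print(f"Ntot = {n_tot:,}")


def decode(index: int) -> dict:
    """Map a flat index in [0, n_tot) to one level index per factor
    (mixed-radix decoding; a hypothetical helper, not the authors' code)."""
    combo = {}
    for name, n in reversed(list(levels.items())):
        index, combo[name] = divmod(index, n)
    return combo


# Assumed selection scheme: uniform sampling without replacement.
random.seed(0)
sample_size = 500_000
sampled_ids = random.sample(range(n_tot), sample_size)
fraction = sample_size / n_tot
print(f"Sampled fraction of the full space: {fraction:.1%}")
print(decode(sampled_ids[0]))
```

Sampling flat indices and decoding them on demand avoids materializing all combinations in memory, which is convenient when the space is far larger than the subset actually sent to the model.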

Experimental vs. computational illustrative case of study

NANO.PTML-DT simulation experiment

In this section, a computational case study is presented to simulate the Fe3O4_A@CTAB and Fe3O4_B@CTAB NPs from the experimental study detailed in this paper (Fig. 8). The aim of this simulation experiment was to forecast the best combination of NP cores vs. cell lines (cytotoxicity or ecotoxicity) vs. shapes vs. coating agents, as mentioned in the previous section. In this scenario, we created a total of 500,000 assays as a new prediction dataset, which was formed by up to n(NPs core) = 5, n(cn1 = cell lines) = 53, n(cn2 = NP shapes) = 5, n(NPs coat) = 16 and n(drugs) = 123.

Fig. 8

Workflow of experimental illustrative simulation experiment using NANO.PTML approach

The DT model was selected because of the good performance of its statistical parameters in both the training and validation sets, as shown in Table 1. The probability values p(NANO.PTMLin)cnj were obtained with the NANO.PTMLin system. The heatmap in Fig. 9 illustrates the findings using a 3-color scale based on probability values: the green zone represents a high probability range, the yellow zone a moderate to low probability range, and the red zone a very low predicted probability. Assays that had never been reported previously or had very low representation in the original dataset, as well as insignificant combinations of NP systems, are depicted in white to prevent overestimation in the results. The columns of the heatmap represent the NP core, cell lines, and NP morphology. The column for cell lines is further categorized into cytotoxicity and ecotoxicity. The rows of the heatmap correspond to the NP coats studied, arranged according to their McGowan volume_n values. Furthermore, the heatmap contains information on the frequency with which each combination appears in both the columns and rows of the prediction dataset.
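The color coding described above can be expressed as a simple lookup. The following is a minimal sketch in which the probability cut-offs and the minimum number of cases are hypothetical placeholders, since the actual thresholds used for Fig. 9 are not given in this section.

```python
from typing import Optional

# Hypothetical cut-offs; the values used for the published heatmap may differ.
GREEN_MIN = 0.7   # probabilities at or above this fall in the green zone
RED_MAX = 0.2     # probabilities at or below this fall in the red zone
MIN_CASES = 10    # below this representation the cell is left white


def heatmap_zone(prob: Optional[float], n_cases: int) -> str:
    """Assign a heatmap color to one NP-system combination."""
    # Unreported or poorly represented combinations are not colored,
    # to avoid over-interpreting predictions with little data behind them.
    if prob is None or n_cases < MIN_CASES:
        return "white"
    if prob >= GREEN_MIN:
        return "green"
    if prob > RED_MAX:
        return "yellow"
    return "red"


for p, n in [(0.92, 150), (0.45, 80), (0.05, 60), (0.88, 3), (None, 0)]:
    print(p, n, heatmap_zone(p, n))
```

Checking representation before probability keeps a confident-looking prediction for a sparsely sampled combination from being shown as green.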

Fig. 9

NANO.PTML systems experiment simulation

The prediction was carried out taking into account both cytotoxicity and ecotoxicity. Studying cytotoxicity is crucial because NPs are increasingly employed in medical diagnostics and therapies to enhance our comprehension, detection, and treatment of human diseases. The presence of NPs in consumer products or their use in emerging biomedical applications, such as drug delivery, biosensors [69], or imaging agents [70], entails direct ingestion or injection into the body [71]. Additionally, the study of ecotoxicity is critical for assessing the impact of NPs on ecosystems, wildlife, and human health [72]. It helps in understanding how NPs interact with the environment, enter food chains, and potentially affect biodiversity.

In this context, the outcomes of the DT model highlighted certain NANO.PTML systems as promising candidates for further investigation. Interestingly, Lycopersicon esculentum proved to be a favorable ecotoxicity cell line, exhibiting high probability values with the majority of coating systems. In contrast, the least favorable cell lines were Danio rerio (embryos), Danio rerio (juvenile), Danio rerio (adults), Oryzias latipes (adults), Ceriodaphnia dubia (neonates), Daphnia pulex (adults), Chlorella sp., and Scenedesmus sp., which yielded medium to low probability values. Another important characteristic is the McGowan volume, which has been widely used in many areas to estimate the physicochemical and biochemical properties of molecules [73, 74]; it was relevant specifically for CTAB, PS, and PEG as coating agents, with PVA as an exception. The combination of elliptical-shaped NPs with PVA as a coating agent in cytotoxicity cell lines appears to be a promising candidate for further synthesis. Another important factor is the type of cell line: higher probability values were obtained for cytotoxicity cell lines. It is important to note that all predictions generated by this method should be approached with caution and require experimental validation. The NANO.PTML-DT method holds potential for accelerating experimental studies and offering cost-effective preliminary results for a vast database of NANO.PTML systems. This methodology presents an effective and robust tool for guiding experimental research, offering an alternative to laborious trial-and-error testing.

General applications of NANO.PTML-DT model

The NANO.PTML model has different types of applications in various stages of N2D3 system development, as shown in Fig. 10. It includes the selection of new cores, coats, or drugs. In all these cases, the N2D3 systems can be optimized in terms of drug activity and NP system safety (cytotoxicity and ecotoxicity). The first three applications involve the selection of input variables. In the NP core scanning stage, researchers can select different types, sizes, and shapes. In the NP coats scanning stage, they can select up to 16 coating agents, such as CTAB, PVA, PVP, etc. In the drug scanning stage, they can carry out NDDs synthesis modifications, repurposing, and patent greening. The synthesis modifications refer to the prediction of new N2D3 systems (different coats, cores) for new drug structures with potential NDDs activity [75]. Repurposing refers to the prediction of new NDDs for N2D3 systems from already known drugs with other activities [76]. Patent greening applications refer to the prediction of new N2D3 systems (different coats, cores) for already known NDDs [77]. In all these cases, the outcomes predicted by the NANO.PTML model can optimize NP safety and/or biological activity of NDDs. To make these predictions, we have to change the values of different input variables. In Fig. 10, we highlighted the input variables that need to be changed to make predictions for different applications. For details about the variables, see AI/ML Python Computational Models section. The variables in these four stages can be changed one by one according to the researchers’ needs; however, they can also be changed simultaneously. For example, in the simulation experiment shown in Fig. 9, we created a total of 500,000 assays in which up to 123 drugs, 53 cell lines, 16 NP coats, 5 NP shapes, and 5 NP cores were changed at the same time.

Fig. 10

Other applications of the NANO.PTML model

Conclusions

The NANO.PTML model, which integrates NDDs and NP models, offers a practical solution for developing new NP systems as drug carriers for neurodegenerative diseases. It effectively addresses the challenge of exploring numerous combinations of NP and NDDs compounds. The best-performing AI/ML model, using the DT algorithm, achieved high Sp (96.4%/96.2%) and Sn (79.3%/75.7%) in training and validation, with AUROC values of 0.97 and 0.96. Chemically synthesized Fe3O4 NPs were structurally characterized and coated with CTAB to enhance water solubility. We illustrated an application of the IFPTML-DT model to a real experiment (reported here). To do this, we performed a simulation experiment using a large prediction dataset of 500,000 cases/empirical experiments similar to the NPs studied in the experimental part. This simulation experiment identified certain NP systems as promising candidates for further investigation, highlighting the Lycopersicon esculentum cell line for ecotoxicity studies according to the green section of Fig. 9. The McGowan volume was significant for certain coating agents (CTAB, PS, PEG) but not for PVA. Overall, the NANO.PTML model expedites experimental research and provides reliable initial findings, reducing the reliance on time-consuming wet-lab procedures.

Data availability

The datasets generated and/or analyzed during the current study are available in the Figshare repository, DOI: 10.6084/m9.figshare.25450291. On the other hand, the code of the NANO.PTML models was uploaded to a GitHub repository and is available free for use by researchers. For the NANO.PTML models code the link is: https://github.com/she012/NANO.PTML-project.

Abbreviations

AI:

Artificial Intelligence refers to the use of computers to simulate intelligent behavior with minimal human involvement [78]

AUROC:

Area Under Receiver Operating Characteristic is commonly used to assess the accuracy of diagnostic tests. A ROC curve that is closer to the upper left corner of the graph indicates higher test accuracy, as this point represents a sensitivity of 1 and a false positive rate of 0 (specificity of 1) [79]

BBB:

Blood-Brain Barrier is a selective permeability barrier formed by the blood vessels that vascularize the central nervous system [80]

CNS:

Central Nervous System is the brain and spinal cord. This system is responsible for receiving, processing, and responding to sensory information [81]

CTAB:

Cetyl Trimethyl Ammonium Bromide is a cationic surfactant

DLS:

Dynamic Light Scattering measures changes in scattering intensity over time caused by particles moving randomly due to Brownian motion [82]

DNDS:

Drug Nanoparticle Delivery System

DRX:

X-Ray Diffraction is a widely used method for examining crystal structures and atomic spacing [83]

DT:

Decision Tree is a hierarchical decision support model that employs a tree-like structure to represent decisions and their potential outcomes [54]

GB:

Gradient Boosting is an ensemble machine learning technique that sequentially combines the predictions of multiple weak learners, usually decision trees [57]

IF:

Information Fusion is the process of integrating data from different sources

kNN:

K-Nearest Neighbors is a popular machine learning technique based on the principle that data points which are close to each other are likely to share similar labels or values [84]

LDA:

Linear Discriminant Analysis is a machine learning method that aims to identify a linear combination of features that effectively distinguishes between two or more classes of objects or events [85]

NDs:

Neurodegenerative Diseases

MA:

Moving Average is a method used to analyze data points by calculating a series of average values from various subsets of the entire dataset [86]

MCC:

Matthews Correlation Coefficient is a metric for binary classification that evaluates predictions by considering true positives, true negatives, false positives, and false negatives [87]

ML:

Machine Learning is the science of developing algorithms and statistical models that enable computer systems to perform tasks without explicit instructions, relying instead on patterns and inferences [88]

NP:

Nanoparticle

NDDs:

Neurodegenerative Disease Drugs

N2D3:

Nanoparticle Neuronal Disease Drug Delivery systems

PMAO:

Poly(Maleic Anhydride-alt-1-Octadecene)

PT:

Perturbation Theory involves starting with a simple system for which a mathematical solution is already known. Then, an additional perturbation is introduced to represent a weak disturbance to the system [89]

PTOs:

Perturbation Theory Operators are linear and non-linear transformations of the moving average, for example, the deviation of the moving average

RF:

Random Forest is a popular machine learning technique that combines the collective predictions of numerous decision trees to generate a combined outcome [90]

RT:

Room Temperature

TEM:

Transmission Electron Microscopy is a method of microscopy where a specimen is illuminated with a beam of electrons, allowing the formation of an image as the electrons pass through the specimen [91]

TGA:

ThermoGravimetric Analysis is a method used to assess the thermal stability of materials, including polymers, by measuring changes in weight as a function of temperature [92]

VSM:

Vibrating-Sample Magnetometer is a device used to measure the magnetic properties of a sample while it undergoes perpendicular vibrations within a uniform magnetic field [93]

References

  1. Asefy Z, Hoseinnejhad S, Ceferov Z. Nanoparticles approaches in neurodegenerative diseases diagnosis and treatment. Neurol Sci. 2021;42:2653–60.

  2. Agnello L, Ciaccio M. Neurodegenerative diseases: from molecular basis to therapy. Volume 23. MDPI; 2022. p. 12854.

  3. Fahn S. The 200-year journey of Parkinson disease: reflecting on the past and looking towards the future. Parkinsonism Relat Disord. 2018;46:S1–5.

  4. Knopman DS, Amieva H, Petersen RC, Chételat G, Holtzman DM, Hyman BT, Nixon RA, Jones DT. Alzheimer disease. Nat Reviews Disease Primers. 2021;7:33.

  5. Abramov AY, Berezhnov AV, Fedotova EI, Zinchenko VP, Dolgacheva LP. Interaction of misfolded proteins and mitochondria in neurodegenerative disorders. Biochem Soc Trans. 2017;45:1025–33.

  6. Zakharova M. Modern approaches in gene therapy of motor neuron diseases. Med Res Rev. 2021;41:2634–55.

  7. Myasoedov NF, Lyapina LA, Andreeva LA, Grigorieva ME, Obergan TY, Shubina TA. The modern view on the role of glyprolines by metabolic syndrome. Med Res Rev. 2021;41:2823–40.

  8. Kukharsky MS, Skvortsova VI, Bachurin SO, Buchman VL. In a search for efficient treatment for amyotrophic lateral sclerosis: old drugs for new approaches. Med Res Rev. 2021;41:2804–22.

  9. Gudasheva TA, Povarnina PY, Tarasiuk AV, Seredenin SB. Low-molecular mimetics of nerve growth factor and brain‐derived neurotrophic factor: design and pharmacological properties. Med Res Rev. 2021;41:2746–74.

  10. Dobrovolskaia MA, Shurin M, Shvedova AA. Current understanding of interactions between nanoparticles and the immune system. Toxicol Appl Pharmacol. 2016;299:78–89.

  11. Moghimi SM. Chemical camouflage of nanospheres with a poorly reactive surface: towards development of stealth and target-specific nanocarriers. Biochim et Biophys Acta -Molecular Cell Res. 2002;1590:131–9.

  12. Delche NA, Kheiri R, Nejad BG, Sheikhi M, Razavi MS, Rahimzadegan M, Salmasi Z. Recent progress in the intranasal PLGA-based drug delivery for neurodegenerative diseases treatment. Iran J Basic Med Sci. 2023;26:1107.

  13. Steeland S, Vandenbroucke RE, Libert C. Nanobodies as therapeutics: big opportunities for small antibodies. Drug Discovery Today. 2016;21:1076–113.

  14. Vio V, Jose Marchant M, Araya E, Kogan J. Metal nanoparticles for the treatment and diagnosis of neurodegenerative brain diseases. Curr Pharm Design. 2017;23:1916–26.

  15. Shah R, Eldridge D, Palombo E, Harding I. Lipid nanoparticles: production, characterization and stability. Springer; 2015.

  16. Elzahhar P, Belal AS, Elamrawy F, Helal NA, Nounou MI. Bioconjugation in drug delivery: practical perspectives and future perceptions. Pharm Nanotechnology: Basic Protocols 2019:125–82.

  17. Jones DE, Ghandehari H, Facelli JC. A review of the applications of data mining and machine learning for the prediction of biomedical properties of nanoparticles. Comput Methods Programs Biomed. 2016;132:93–103.

  18. Sayes C, Ivanov I. Comparative study of predictive computational models for nanoparticle-induced cytotoxicity. Risk Analysis: Int J. 2010;30:1723–34.

  19. Makadia HK, Siegel SJ. Poly lactic-co-glycolic acid (PLGA) as biodegradable controlled drug delivery carrier. Polymers. 2011;3:1377–97.

  20. Gajewicz A. What if the number of nanotoxicity data is too small for developing predictive Nano-QSAR models? An alternative read-across based approach for filling data gaps. Nanoscale. 2017;9:8435–48.

  21. Oksel C, Ma CY, Liu JJ, Wilkins T, Wang XZ. (Q) SAR modelling of nanomaterial toxicity: a critical review. Particuology. 2015;21:1–19.

  22. Winkler DA, Mombelli E, Pietroiusti A, Tran L, Worth A, Fadeel B, McCall MJ. Applying quantitative structure–activity relationship approaches to nanotoxicology: current status and future potential. Toxicology. 2013;313:15–23.

  23. Gajewicz A, Rasulev B, Dinadayalane TC, Urbaszek P, Puzyn T, Leszczynska D, Leszczynski J. Advancing risk assessment of engineered nanomaterials: application of computational approaches. Adv Drug Deliv Rev. 2012;64:1663–93.

  24. Tantra R, Oksel C, Puzyn T, Wang J, Robinson KN, Wang XZ, Ma CY, Wilkins T. Nano (Q) SAR: challenges, pitfalls and perspectives. Nanotoxicology. 2015;9:636–42.

  25. Oksel C, Ma CY, Liu JJ, Wilkins T, Wang XZ. Literature review of (Q) SAR modelling of nanomaterial toxicity. Modelling Toxic Nanopart 2017:103–42.

  26. Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Gonzalez-Diaz H. Designing nanoparticle release systems for drug-vitamin cancer co-therapy with multiplicative perturbation-theory machine learning (PTML) models. Nanoscale. 2019;11:21811–23.

  27. Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Gonzalez-Diaz H. Predicting coated-nanoparticle drug release systems with perturbation-theory machine learning (PTML) models. Nanoscale. 2020;12:13471–83.

  28. Dieguez-Santana K, Gonzalez-Diaz H. Towards machine learning discovery of dual antibacterial drug-nanoparticle systems. Nanoscale. 2021;13:17854–70.

  29. Ortega-Tenezaca B, Gonzalez-Diaz H. IFPTML mapping of nanoparticle antibacterial activity vs. pathogen metabolic networks. Nanoscale. 2021;13:1318–30.

  30. Castellanos-Rubio I, Munshi R, Qin Y, Eason DB, Orue I, Insausti M, Pralle A. Multilayered inorganic–organic microdisks as ideal carriers for high magnetothermal actuation: assembling ferrimagnetic nanoparticles devoid of dipolar interactions. Nanoscale. 2018;10:21879–92.

  31. Castellanos-Rubio I, Arriortua O, Iglesias-Rojas D, Barón A, Rodrigo I, Marcano L, Garitaonandia JS, Orue Ia, Fdez-Gubieda ML, Insausti M. A milestone in the chemical synthesis of Fe3O4 nanoparticles: unreported bulklike properties lead to a remarkable magnetic hyperthermia. Chem Mater. 2021;33:8693–704.

  32. Nader K, Castellanos-Rubio I, Orue I, Iglesias-Rojas D, Barón A, de Muro IG, Lezama L, Insausti M. Getting insight into how iron (III) oleate precursors affect the features of magnetite nanoparticles. J Solid State Chem. 2022;316:123619.

  33. Castellanos-Rubio I, Rodrigo I, Olazagoitia-Garmendia A, Arriortua O, Gil de Muro I, Garitaonandia JS, Bilbao JRn, Fdez-Gubieda ML, Plazaola F, Orue Ia. Highly reproducible hyperthermia response in water, agar, and cellular environment by discretely PEGylated magnetite nanoparticles. ACS Appl Mater Interfaces. 2020;12:27917–29.

  34. He S, Abarrategi JS, Bediaga H, Arrasate S, González-Díaz H. On Additive Artificial Intelligence Discovery of Nanoparticle-Neurodegenerative Disease Drug Delivery Systems. Beilstein Archives 2024; 2024:10.

  35. Bento AP, Gaulton A, Hersey A, Bellis LJ, Chambers J, Davies M, Kruger FA, Light Y, Mak L, McGlinchey S, et al. The ChEMBL bioactivity database: an update. Nucleic Acids Res. 2014;42:D1083–1090.

  36. Davies M, Nowotka M, Papadatos G, Dedman N, Gaulton A, Atkinson F, Bellis L, Overington JP. ChEMBL web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015;43:W612–620.

  37. Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–1107.

  38. Concu R, Kleandrova VV, Speck-Planche A, Cordeiro M. Probing the toxicity of nanoparticles: a unified in silico machine learning model based on perturbation theory. Nanotoxicology. 2017;11:891–906.

  39. Luan F, Kleandrova VV, Gonzalez-Diaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MN. Computer-aided nanotoxicology: assessing cytotoxicity of nanoparticles under diverse experimental conditions by using a novel QSTR-perturbation approach. Nanoscale. 2014;6:10623–30.

  40. Kleandrova VV, Luan F, Gonzalez-Diaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MN. Computational ecotoxicology: simultaneous prediction of ecotoxic effects of nanoparticles under different experimental conditions. Environ Int. 2014;73:288–94.

  41. Gaulton A, Hersey A, Nowotka M, Bento AP, Chambers J, Mendez D, Mutowo P, Atkinson F, Bellis LJ, Cibrian-Uhalte E, et al. The ChEMBL database in 2017. Nucleic Acids Res. 2017;45:D945–54.

  42. Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Felix E, Magarinos MP, Mosquera JF, Mutowo P, Nowotka M, et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 2019;47:D930–40.

  43. Moriwaki H, Tian Y-S, Kawashita N, Takagi T. Mordred: a molecular descriptor calculator. J Cheminform. 2018;10:1–14.

  44. Kleandrova VV, Luan F, Gonzalez-Diaz H, Ruso JM, Speck-Planche A, Cordeiro MN. Computational tool for risk assessment of nanomaterials: novel QSTR-perturbation model for simultaneous prediction of ecotoxicity and cytotoxicity of uncoated and coated nanoparticles under multiple experimental conditions. Environ Sci Technol. 2014;48:14686–94.

  45. Luan F, Kleandrova VV, González-Díaz H, Ruso JM, Melo A, Speck-Planche A, Cordeiro MNDS. Computer-aided nanotoxicology: assessing cytotoxicity of nanoparticles under diverse experimental conditions by using a novel QSTR-perturbation approach. Nanoscale. 2014;6:10623–30.

  46. Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Montemore MM, Gonzalez-Diaz H. PTML Model for Selection of Nanoparticles, anticancer drugs, and vitamins in the design of drug-vitamin nanoparticle Release systems for Cancer Cotherapy. Mol Pharm. 2020;17:2612–27.

  47. Urista DV, Carrue DB, Otero I, Arrasate S, Quevedo-Tumailli VF, Gestal M, Gonzalez-Diaz H, Munteanu CR. Prediction of Antimalarial Drug-decorated nanoparticle Delivery systems with Random Forest models. Biology (Basel) 2020; 9.

  48. Hill T, Lewicki P. Statistics: Methods and Applications. 1st ed. StatSoft, Inc.; 2005.

  49. Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–7.

  50. Huberty CJ, Olejnik S. Applied MANOVA and discriminant analysis. 2nd ed. Hoboken, New Jersey: John Wiley & Sons, Inc.; 2006.

  51. Hanczar B, Hua J, Sima C, Weinstein J, Bittner M, Dougherty ER. Small-sample precision of ROC-related estimates. Bioinformatics. 2010;26:822–30.

  52. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30.

  53. Anowar F, Sadaoui S, Selim B. Conceptual and empirical comparison of dimensionality reduction algorithms (pca, kpca, lda, mds, svd, lle, isomap, Le, Ica, t-sne). Comput Sci Rev. 2021;40:100378.

  54. Kotsiantis SB. Decision trees: a recent overview. Artif Intell Rev. 2013;39:261–83.

  55. Breiman L. Random forests. Mach Learn. 2001;45:5–32.

  56. Peterson LE. K-nearest neighbor. Scholarpedia. 2009;4:1883.

  57. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat 2001:1189–232.

  58. Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21:1–13.

  59. Feurer M, Springenberg J, Hutter F. Initializing bayesian hyperparameter optimization via meta-learning. In Proceedings of the AAAI Conference on Artificial Intelligence. 2015.

  60. Swain PH, Hauska H. The decision tree classifier: design and potential. IEEE Trans Geoscience Electron. 1977;15:142–7.

  61. Luan F, Cordeiro MN, Alonso N, Garcia-Mera X, Caamano O, Romero-Duran FJ, Yanez M, Gonzalez-Diaz H. TOPS-MODE model of multiplexing neuroprotective effects of drugs and experimental-theoretic study of new 1,3-rasagiline derivatives potentially useful in neurodegenerative diseases. Bioorg Med Chem. 2013;21:1870–9.

  62. Garcia I, Fall Y, Gomez G, Gonzalez-Diaz H. First computational chemistry multi-target model for anti-Alzheimer, anti-parasitic, anti-fungi, and anti-bacterial activity of GSK-3 inhibitors in vitro, in vivo, and in different cellular lines. Mol Divers. 2011;15:561–7.

  63. Ferreira da Costa J, Silva D, Caamaño O, Brea JM, Loza MI, Munteanu CR, Pazos A, García-Mera X, Gonzalez-Diaz H. Perturbation theory/machine learning model of ChEMBL data for dopamine targets: docking, synthesis, and assay of new l-prolyl-l-leucyl-glycinamide peptidomimetics. ACS Chem Neurosci. 2018;9:2572–87.

  64. Kar S, Gajewicz A, Puzyn T, Roy K, Leszczynski J. Periodic table-based descriptors to encode cytotoxicity profile of metal oxide nanoparticles: a mechanistic QSTR approach. Ecotoxicol Environ Saf. 2014;107:162–9.

  65. Jagiello K, Grzonkowska M, Swirog M, Ahmed L, Rasulev B, Avramopoulos A, Papadopoulos MG, Leszczynski J, Puzyn T. Advantages and limitations of classic and 3D QSAR approaches in nano-QSAR studies based on biological activity of fullerene derivatives. J Nanopart Res. 2016;18:1–16.

  66. Mikolajczyk A, Gajewicz A, Mulkiewicz E, Rasulev B, Marchelek M, Diak M, Hirano S, Zaleska-Medynska A, Puzyn T. Nano-QSAR modeling for ecosafe design of heterogeneous TiO 2-based nano-photocatalysts. Environ Science: Nano. 2018;5:1150–60.

  67. Durán FJR, Alonso N, Caamaño O, García-Mera X, Yañez M. Multi-Target Prediction of Neuroprotective Drugs, Synthesis, Assay, and Theoretical Study of Rasagiline Carbamates. 2015.

  68. Romero-Duran FJ, Alonso N, Yanez M, Caamano O, García-Mera X, Gonzalez-Diaz H. Brain-inspired cheminformatics of drug-target brain interactome, synthesis, and assay of TVP1022 derivatives. Neuropharmacology. 2016;103:270–8.

  69. Doria G, Conde J, Veigas B, Giestas L, Almeida C, Assunção M, Rosa J, Baptista PV. Noble metal nanoparticles for biosensing applications. Sensors. 2012;12:1657–87.

  70. Rabin O, Manuel Perez J, Grimm J, Wojtkiewicz G, Weissleder R. An X-ray computed tomography imaging agent based on long-circulating bismuth sulphide nanoparticles. Nat Mater. 2006;5:118–22.

  71. Lewinski N, Colvin V, Drezek R. Cytotoxicity of nanoparticles. Small. 2008;4:26–49.

  72. Handy RD, Owen R, Valsami-Jones E. The ecotoxicology of nanoparticles and nanomaterials: current status, knowledge gaps, challenges, and future needs. Ecotoxicology. 2008;17:315–25.

  73. Abraham MH, Chadha HS, Martins F, Mitchell RC, Bradbury MW, Gratton JA. Hydrogen bonding part 46: a review of the correlation and prediction of transport properties by an LFER method: physicochemical properties, brain penetration and skin permeability. Pest Sci. 1999;55:78–88.

  74. Abraham MH, McGowan J. The use of characteristic volumes to measure cavity terms in reversed phase liquid chromatography. Chromatographia. 1987;23:243–6.

  75. Shamsi J, Urban AS, Imran M, De Trizio L, Manna L. Metal halide perovskite nanocrystals: synthesis, post-synthesis modifications, and their optical properties. Chem Rev. 2019;119:3296–348.

  76. Pushpakom S, Iorio F, Eyers PA, Escott KJ, Hopper S, Wells A, Doig A, Guilliams T, Latimer J, McNamee C. Drug repurposing: progress, challenges and recommendations. Nat Rev Drug Discovery. 2019;18:41–58.

  77. Du K, Li P, Yan Z. Do green technology innovations contribute to carbon dioxide emission reduction? Empirical evidence from patent data. Technological Forecast Social Change. 2019;146:297–303.

  78. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism. 2017;69:S36–40.

  79. Nahm FS. Receiver operating characteristic curve: overview and practical use for clinicians. Korean J Anesthesiology. 2022;75:25.

  80. Daneman R, Prat A. The blood–brain barrier. Cold Spring Harb Perspect Biol. 2015;7:a020412.

  81. Brodal P. The central nervous system. Oxford University Press; 2010.

  82. Kaszuba M, McKnight D, Connah MT, McNeil-Watson FK, Nobbmann U. Measuring sub nanometre sizes using dynamic light scattering. J Nanopart Res. 2008;10:823–9.

  83. Bunaciu AA, Udriştioiu EG, Aboul-Enein HY. X-ray diffraction: instrumentation and applications. Crit Rev Anal Chem. 2015;45:289–99.

  84. Kramer O. K-nearest neighbors. Dimensionality Reduct Unsupervised Nearest Neighbors. 2013:13–23.

  85. Balakrishnama S, Ganapathiraju A. Linear discriminant analysis-a brief tutorial. Inst Signal Inform Process. 1998;18:1–8.

  86. Hunter JS. The exponentially weighted moving average. J Qual Technol. 1986;18:203–10.

  87. Chicco D, Jurman G. The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification. BioData Min. 2023;16:4.

  88. Zhou Z-H. Machine learning. Springer Nature; 2021.

  89. Stevenson PM. Optimized perturbation theory. Phys Rev D. 1981;23:2916.

  90. Biau G, Scornet E. A random forest guided tour. Test. 2016;25:197–227.

  91. Zuo JM, Spence JC. Advanced transmission electron microscopy. Springer; 2017.

  92. Coats A, Redfern J. Thermogravimetric analysis. A review. Analyst. 1963;88:906–24.

  93. Foner S. Versatile and sensitive vibrating-sample magnetometer. Rev Sci Instrum. 1959;30:548–57.

Acknowledgements

Mrs. Arrate Bañeres is acknowledged for management support.

Funding

We gratefully acknowledge financial support from the Basque Government / Eusko Jaurlaritza (IT1558-22), SPRI ELKARTEK grant AIMOFGIF (KK-2022/00032), the Ministry of Science and Innovation (PID2022-137365NB-I00), and Eusko Jaurlaritza, LANBIDE, INVESTIGO Grants, IKERDATA 2022/IKER/000040, funded by NextGenerationEU funds of the European Commission. We also acknowledge the Spanish Ministry of Science and Innovation for financial support under grant No. PID2022-136993OB-I00 (AEI/FEDER, UE), funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by “ERDF A way of making Europe”, by the “European Union”. This work was also supported in part by the U.S. Department of Energy (DOE) Grant DE-SC0022239. The work used resources of the Center for Computationally-Assisted Science and Technology (CCAST) at North Dakota State University (Fargo, ND, USA), which was made possible in part by the U.S. National Science Foundation (NSF) MRI Award No. 2019077.

Author information

Contributions

ICR, MI, SA, and HGD conceived the presented idea. JSA collected the dataset. SH, GMCM, BR, SA, and HGD implemented the idea computationally and performed the computations and analysis. KN, ICR, and MI performed the nanoparticle synthesis experiments. GMCM, ICR, MI, BR, SA, and HGD supervised the findings of this work. All authors discussed the results and contributed to writing the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Idoia Castellanos-Rubio or Sonia Arrasate.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

He, S., Nader, K., Abarrategi, J.S. et al. NANO.PTML model for read-across prediction of nanosystems in neurosciences. computational model and experimental case of study. J Nanobiotechnol 22, 435 (2024). https://doi.org/10.1186/s12951-024-02660-9

  • DOI: https://doi.org/10.1186/s12951-024-02660-9

Keywords