A local QSAR model based on the stability of nitrenium ions to support the ICH M7 expert review on the mutagenicity of primary aromatic amines

Background
Aromatic amines, often used as intermediates for pharmaceutical synthesis, may be mutagenic and therefore pose a challenge as metabolites or impurities in drug development. However, predicting the mutagenicity of aromatic amines using commercially available quantitative structure-activity relationship (QSAR) tools is difficult and often requires expert review. In this study, we developed a shareable QSAR tool based on nitrenium ion stability.

Results
The evaluation using in-house aromatic amine intermediates revealed that our model predicts aromatic amine mutagenicity with an accuracy comparable to that of commercial QSAR tools. The effect of changing the number and position of substituents on the mutagenicity of aromatic amines was successfully explained by the change in the nitrenium ion stability. Furthermore, case studies showed that our QSAR tool can support the expert review with quantitative indicators.

Conclusions
This local QSAR tool will be useful as a quantitative support tool to explain the substituent effects on the mutagenicity of primary aromatic amines. By further refinement through method sharing and standardization, our tool can support the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) M7 expert review with quantitative indicators.


Introduction
The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) M7 guideline describes the process of hazard identification and risk assessment of impurities that may be present in a drug substance or product [1]. Risk assessment allows the assessment of mutagenicity using quantitative structure-activity relationship (QSAR) tools, which enables the screening of numerous potential impurities. ICH M7 recommends that bacterial mutagenicity should be assessed using two QSAR methodologies, namely, by expert rule-based and statistics-based methodologies. The prediction accuracy of QSAR tools is improved by increasing the number of training sets and modifying algorithms [2,3]. However, the mutagenicity of several structures, such as aromatic amines, is still difficult to predict when using QSAR tools [4,5].
Because aromatic amines are widely used as intermediates in pharmaceutical synthesis and can remain as impurities, their mutagenicity is a serious safety issue [4,5]. In addition, aromatic amines can be formed as metabolites, especially for drugs with amide bonds that may easily break down by enzymatic hydrolysis [6,7].
Predicting the mutagenicity of aromatic amines remains a major challenge for QSAR tools because positive predictions are sometimes false positives. Therefore, an improvement of prediction accuracy is strongly desired from the viewpoint of regulatory science.
In recent years, several efforts have been made to improve the prediction accuracy of QSAR tools. One method is to share nonpublic data held by pharmaceutical companies, as existing QSAR tools have been built based on public knowledge [4,5]. Another method is to perform expert reviews. The SAR fingerprint, a chemical fingerprint of aromatic amine mutagenicity developed by Ahlberg et al. [4], is a useful approach to performing expert reviews [4,8-10]. However, the SAR fingerprint approach is difficult to apply to mutagenicity assessment when activating and deactivating substituents are simultaneously present. We considered whether quantitative predictive indices based on mutagenic mechanisms can complement the SAR fingerprinting approach.
The mechanism underlying mutagenic induction of aromatic amines has been well studied (Fig. 1). In the metabolic activation of aromatic amines, first, hydroxylamine is generated by N-hydroxylation by cytochrome P450 enzymes (CYPs; mainly CYP1A2). The hydroxylamine is either conjugated by phase II enzymes (O-acetyltransferases, N-acetyltransferases, or sulfotransferases) or directly hydrolyzed to nitrenium ions, which form covalent bonds with DNA [11]. Several local QSAR tools have been evaluated in terms of ease of hydroxylamine formation [12,13], ease of nitrenium ion formation [14], and nitrenium ion stability [15-17].
These local QSAR tools are useful, but they are not user friendly for genotoxicologists. Therefore, this study developed a local QSAR model based on nitrenium ion stability that would be easy to use for genotoxicologists and would support ICH M7 expert reviews with quantitative indicators.

Data set
Two data sets were used in this study, namely, an in-house primary aromatic amine data set used for evaluating QSAR tool performance and a data set of 23 known primary aromatic amines. The in-house data set contained 85 aromatic amines, of which 51 were mutagenic and 34 were nonmutagenic. They were collected using the following procedure. Primary aromatic amines with standard Ames [18], fluctuation Ames [19], or Vitotox [20] test results were extracted from the in-house compound database. The aromatic amines were classified as mutagenic (clearly positive under any condition in any of the tests) or nonmutagenic (negative in at least three strains [TA98, TA100, and TA1537 or TA2637] in the standard Ames test in the absence and presence of S9). We further narrowed down the aromatic amines using six criteria, as described by Bentzien et al. [16]: (i) no formal charge, (ii) molecular weight of < 500 Da, (iii) maximum one stereocenter, (iv) < 10 rotatable bonds, (v) only one aromatic amine functionality, and (vi) no aromatic nitro groups.
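The six selection criteria can be expressed as a simple filter. The following is a minimal sketch, assuming the descriptors have already been computed by a cheminformatics tool; the record type `AmineRecord`, its field names, and the function `passes_criteria` are hypothetical and are not part of the published procedure.

```python
from dataclasses import dataclass


@dataclass
class AmineRecord:
    """Precomputed descriptors for one compound (hypothetical field names)."""
    name: str
    formal_charge: int
    mol_weight: float        # in Da
    n_stereocenters: int
    n_rotatable_bonds: int
    n_aromatic_amines: int
    n_aromatic_nitro: int


def passes_criteria(rec: AmineRecord) -> bool:
    """Return True if the record satisfies criteria (i) to (vi) from the text."""
    return (
        rec.formal_charge == 0            # (i) no formal charge
        and rec.mol_weight < 500.0        # (ii) molecular weight < 500 Da
        and rec.n_stereocenters <= 1      # (iii) at most one stereocenter
        and rec.n_rotatable_bonds < 10    # (iv) fewer than 10 rotatable bonds
        and rec.n_aromatic_amines == 1    # (v) exactly one aromatic amine
        and rec.n_aromatic_nitro == 0     # (vi) no aromatic nitro groups
    )


# Illustrative records only; descriptor values are typed in by hand.
aniline = AmineRecord("aniline", 0, 93.13, 0, 0, 1, 0)
nitroaniline = AmineRecord("4-nitroaniline", 0, 138.12, 0, 1, 1, 1)
kept = [r.name for r in (aniline, nitroaniline) if passes_criteria(r)]
```

Keeping the descriptor calculation separate from the filter means the same criteria can be applied regardless of which software generated the descriptors.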
The data set of known primary aromatic amines consisted of the following structures: a) aniline as a standard, b) aniline with methyl groups (7 structures), c) aniline with a methoxy group (3 structures), d) aniline with a sulfonate group (3 structures), e) aniline with a sulfonamide group (2 structures), and f) primary aromatic amines for which expert reviews have been reported [4,8,9] and their analogs (7 structures).

Commercial QSAR tools
In this study, the following commercial mutagenicity prediction QSAR tools were used: Derek Nexus, version 6.1.0, 2020.

Calculating the stability of nitrenium ions
Bentzien et al. reported an in silico method of predicting the Ames mutagenicity of primary aromatic amines using the nitrenium ion hypothesis of Ford et al. [15,16]. The mutagenic effect of primary aromatic amines is described as the formation of a reactive nitrenium ion, and the authors concluded that nitrenium ion stability correlates with the mutagenic potential. They calculated nitrenium ion stability using a semi-empirical quantum mechanical method and expressed it as the relative energy ΔΔE. We used the heats of formation of Austin model 1 (AM1) [21] optimized structures calculated using MOPAC v7.1 bundled with Molecular Operating Environment (MOE) 2019.01 software (Chemical Computing Group ULC, Canada). To calculate ΔΔE, we first prepared a structure-data file of the molecule and constructed a 3D structure using the "rebuild 3D" feature of MOE. Then, we performed conformational sampling for each neutral molecule using LowModeMD [22] with a force field (e.g., MMFF94x) and an energy cutoff of 7 kcal/mol. Subsequently, we performed geometry optimization using semi-empirical quantum mechanical calculations with the AM1 Hamiltonian for each neutral conformer. The most stable conformer (i.e., the one with the lowest heat of formation) was selected. To generate the nitrenium ion species, one of the amine hydrogens was replaced by a dummy atom X, and geometry optimization was performed again using the keyword CHARGE = +1. The lowest ΔΔE value was adopted for this molecule. The ΔΔE of aniline was set to 0 kcal/mol. If the geometry optimization did not converge, a not-a-number (NaN) value was assigned to that structure. A compound with a negative ΔΔE was predicted to be mutagenic, and a compound with a positive ΔΔE was predicted to be nonmutagenic. The main output file was in csv format and could therefore be handled easily in spreadsheet programs.
The csv file also had a remark column for errors, for example, a NaN, an aromatic ring opening during the geometry optimization, or no aromatic amine.
This procedure was written in Scientific Vector Language (SVL), which is integrated in the MOE modeling package, and the script is freely available to MOE users from MOLSIS Inc. upon request.
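The arithmetic behind ΔΔE and the sign-based prediction can be sketched in a few lines. This is not the SVL script itself. It assumes the commonly used definition ΔE = Hf(nitrenium ion) − Hf(neutral amine) and ΔΔE = ΔE(compound) − ΔE(aniline), which is consistent with the ΔΔE of aniline being 0 kcal/mol; all function names and the heats of formation in the usage lines are hypothetical placeholders, not calculated values.

```python
import math


def delta_e(hof_amine: float, hof_nitrenium: float) -> float:
    """Energy of nitrenium ion formation (assumed definition), in kcal/mol."""
    return hof_nitrenium - hof_amine


def delta_delta_e(hof_amine, hof_nitrenium_candidates, aniline_delta_e):
    """Return the lowest ΔΔE over the candidate nitrenium ions.

    hof_nitrenium_candidates holds one heat of formation per amine hydrogen
    that was removed; None or NaN marks an optimization that did not converge.
    NaN is returned when no candidate converged.
    """
    converged = [
        h for h in hof_nitrenium_candidates
        if h is not None and not math.isnan(h)
    ]
    if not converged:
        return math.nan
    return min(delta_e(hof_amine, h) for h in converged) - aniline_delta_e


def predict(dde: float) -> str:
    """Sign-based call from the text: negative is mutagenic, positive is not."""
    if math.isnan(dde):
        return "no prediction"
    if dde < 0:
        return "mutagenic"
    if dde > 0:
        return "nonmutagenic"
    return "undetermined"  # exactly 0 is not covered by the text


# Placeholder heats of formation (kcal/mol), for illustration only.
ref = delta_e(hof_amine=20.0, hof_nitrenium=250.0)
example = delta_delta_e(10.0, [235.0, 238.0], ref)
call = predict(example)
```

Returning NaN rather than raising an error mirrors the described behavior of flagging non-converged structures in the output file so that the remaining compounds are still processed.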

Analysis of QSAR tool performance
To evaluate the predictive performance of the QSAR model, we used compounds that are not included in known databases or commercial QSAR training sets. Therefore, the performance of the QSAR tools was evaluated using in-house aromatic amine data only. The performance metrics employed were accuracy, sensitivity, specificity, positive predictive value, negative predictive value, Matthews correlation coefficient (MCC), and coverage [23].
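For reference, these metrics can all be derived from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; treating coverage as the fraction of compounds that received a positive or negative call is an assumption, and the counts in the usage line are invented for illustration and are not results from this study.

```python
import math


def performance(tp: int, tn: int, fp: int, fn: int, n_no_prediction: int = 0):
    """Compute binary classification metrics from confusion-matrix counts.

    n_no_prediction counts compounds without a positive/negative call
    (e.g., inconclusive or out of domain); it only affects coverage.
    """
    def ratio(num, den):
        return num / den if den else math.nan

    n_predicted = tp + tn + fp + fn
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, n_predicted),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
        "coverage": ratio(n_predicted, n_predicted + n_no_prediction),
    }


# Invented counts, for illustration only.
metrics = performance(tp=8, tn=6, fp=2, fn=4, n_no_prediction=5)
```

MCC is included alongside accuracy because it remains informative when the mutagenic and nonmutagenic classes are unbalanced.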

Results
Analysis of QSAR tool performance using in-house aromatic amines
Figure 2 shows the ΔΔE of each in-house aromatic amine and Ames test results. The ΔΔE values were small for mutagens and large for nonmutagens. Table 1 shows the predictive performance of commercial QSAR tools and ΔΔE against the in-house aromatic amines.
The prediction accuracy of our in-house aromatic amines using ΔΔE was about 70%, which is comparable to the reported prediction accuracy of commercial QSAR tools for aromatic primary amines. The prediction accuracies of commercial QSAR tools for primary aromatic amines are 59 to 64% for 599 test compounds and can be improved to about 63 to 79% by adding new data sets and revising the rules [4]. For another 268 test compounds, the reported accuracy ranged from 57 to 73% [5].
The prediction accuracies of commercially available QSAR tools for the in-house aromatic amines were about 45-58%, which is slightly lower than the reported values. The performance of commercial QSAR tools depends on the quantity, quality, and diversity of the training set. Therefore, we considered the possibility of improving the accuracy by training with our proprietary compounds. On the other hand, since the ΔΔE calculation lacks the concept of learning by structure sets, predictions for new structures are not expected to affect the accuracy.

"Quantitative" SAR fingerprint approach
Figure 3 shows the results of quantitative chemical fingerprinting (termed "SAR fingerprint") of aromatic amines for the representative substituents. According to Ahlberg et al. [4], methyl groups in ortho-, meta-, and para-positions are activating substituents. One methyl group each in the ortho-, meta-, and para-positions decreased ΔΔE. The ΔΔE was lower for aromatic amines with two or three methyl groups, which tested positive (mutagenic) in the Ames test. For the methoxy group, which is an activating substituent, the ΔΔE decreased, except when the methoxy group was in the meta-position, and the aromatic amines with these groups tested positive in the Ames test. Sulfonate and sulfonamide groups, which are considered deactivating substituents, increased the ΔΔE in the ortho-, meta-, and para-positions, and aromatic amines with these groups tested negative in the Ames test.

ICH M7 expert review case studies
The applicability of the ΔΔE method was examined for several compounds for which expert reviews based on ICH M7 have been reported (Table 2).
Case study 1
4-Amino-3-methylbenzenesulfonic acid, which is No. 17 in Table 2, is compound D in the report by Ahlberg et al. [4]. Given that this compound has strong activating (methyl in the ortho-position) and deactivating (sulfonate in the para-position) substituents simultaneously, we opted to use it for quantitative evaluation. In addition, their paper did not specify the results predicted by commercial QSAR tools for this compound, and our study showed conflicting results with those from commercial QSAR tools (Table 2).
The methyl group in the ortho-position slightly decreased the ΔΔE, whereas the sulfonate group in the para-position caused a significant increase. The ΔΔE of this compound was approximately the sum of the ΔΔE values of aromatic amines with a methyl group in the ortho-position and a sulfonic acid group in the para-position, and it was predicted to be nonmutagenic and tested negative in the Ames test (Fig. 3, Table 2) [33]. The results suggest that the quantitative index complements the SAR fingerprint approach of aromatic amine mutagenicity developed by Ahlberg et al. [4].

Case study 2
Methyl 2-amino-5-bromobenzoate, which is No. 18 in Table 2, is example 10 in the expert review by Amberg et al. [8]. We used this compound as a case study because their paper reported that commercial QSAR tools give conflicting results. This compound is reported to be nonmutagenic by the expert rule-based model and inconclusive by the statistics-based model. The bromo group in the para-position and carboxylate in the ortho-position increased the ΔΔE, and the compound was predicted to be less mutagenic (Fig. 3, Table 2) and tested negative in the Ames test [34]. The bromo group in the para-position and carboxylate in the ortho-position are reported to be deactivating substituents [4]; thus, the results support the expert review with quantitative indicators.

Case study 3
2-Amino-5-chlorobenzotrifluoride, which is No. 19 in Table 2, is case 5 in the report by Mishima et al. [9]. This compound was also used as a case study because their paper reported conflicting results with a commercial QSAR tool. This compound is reported to be nonmutagenic by the expert rule-based model and mutagenic by the statistics-based model.
The chloro group in the para-position slightly increased the ΔΔE, and the compound was reported to be mutagenic. The compound with a trifluoromethyl group in the ortho-position had a large ΔΔE and was nonmutagenic. The ΔΔE of this compound with both substituents was approximately the sum of the ΔΔE values of aromatic amines with a chloro group in the para-position and a trifluoromethyl group in the ortho-position and was predicted to be nonmutagenic and tested negative in the Ames test (Fig. 3, Table 2) [37]. The chloro group in the para-position and carboxylate in the ortho-position are reported to be deactivating substituents [4]; thus, the results also support the expert review with quantitative indicators.
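The approximate additivity of substituent contributions observed in case studies 1 and 3 can be written compactly as follows. The notation is introduced here for illustration only: R1 and R2 denote the ortho- and para-substituents, each single-substituent term is the ΔΔE of the corresponding monosubstituted aniline, and the relation is an empirical observation for these compounds rather than a general rule.

```latex
\Delta\Delta E(o\text{-}R_1,\; p\text{-}R_2)
  \;\approx\;
  \Delta\Delta E(o\text{-}R_1) + \Delta\Delta E(p\text{-}R_2)
```

For case study 1, R1 is methyl and R2 is sulfonate; for case study 3, R1 is trifluoromethyl and R2 is chloro.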
In the known primary aromatic amine data set, several compounds gave conflicting results in the commercial QSAR tools (Table 2). The mutagens, namely compounds No. 5, 7, and 8, were accurately predicted in expert rule-based models but were predicted negative, inconclusive, or indeterminate in statistics-based models. These compounds were predicted to be mutagenic with ΔΔE values of −9.8, −7.2, and −15.4 kcal/mol, respectively, supporting the prediction by expert rule-based models. Compounds No. 12 and 17, which are nonmutagenic, were correctly predicted by the expert rule-based models but were predicted as positive in the statistics-based model. These compounds were predicted to be nonmutagenic with ΔΔE values of 19.6 and 17.4 kcal/mol, respectively, supporting the prediction by the expert rule-based model. Accurate prediction of the mutagenicity of compounds with ΔΔE values in the range of about ±5 kcal/mol was difficult, suggesting that an appropriate cutoff value should be set [16].
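One way to reflect the uncertain region around zero in practice is a three-way call with an explicit equivocal band. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the band half-width of 5 kcal/mol is taken from the approximate range mentioned above rather than from a validated cutoff. The five ΔΔE values are those quoted in the text for compounds No. 5, 7, 8, 12, and 17.

```python
def classify_with_gray_zone(dde: float, band: float = 5.0) -> str:
    """Three-way call: values within ±band kcal/mol are flagged as equivocal."""
    if dde < -band:
        return "mutagenic"
    if dde > band:
        return "nonmutagenic"
    return "equivocal"


# ΔΔE values (kcal/mol) quoted in the text, keyed by compound number.
reported = {5: -9.8, 7: -7.2, 8: -15.4, 12: 19.6, 17: 17.4}
calls = {no: classify_with_gray_zone(v) for no, v in reported.items()}
```

All five quoted compounds fall outside the band, which is consistent with their ΔΔE-based predictions agreeing with the Ames results described in the text.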

Discussion
Mechanism-based local QSAR tools are useful; however, they are not user friendly for genotoxicologists and have not been standardized. To make such tools widely applicable to safety assessment for industry and regulatory purposes, we developed a shareable procedure in SVL for predicting aromatic amine mutagenicity based on the concept described by Bentzien et al. [16]. The difference is that in our method, ΔΔE is calculated after selecting the most stable conformer by a conformational search using a force field. This difference makes it easier for genotoxicologists to use our method, as they can now perform everything from compound structure preprocessing to ΔΔE calculations using the MOE software.
The evaluation using in-house aromatic amine intermediates showed that our QSAR tool has a mutagenicity prediction accuracy comparable to that of commercial QSAR tools. Given the complex structure of in-house compounds, characterization of all the compounds that were not predicted correctly was difficult. However, the compounds that resulted in false positive predictions had large substituents. The presence of bulky substituents can inhibit the metabolic activation of aromatic amines by CYP1A2 due to steric hindrance. Alternatively, electron-withdrawing substituents can have a resonance effect on aromatic rings, reducing electron density and disrupting the electron distribution necessary for metabolic activation [13]. These factors are not reflected in the calculation of ΔΔE values, which may lead to false positive predictions.
We also investigated the possibility of supporting the SAR fingerprint of aromatic amines [4], which involves a qualitative evaluation. The effect of changing the number and position of substituents on the mutagenicity of aromatic amines is successfully explained by the change in ΔΔE. Furthermore, case studies using several aromatic amines evaluated by the ICH M7 expert review showed that our QSAR tool can support the expert review with quantitative indicators. In particular, when activating and deactivating substituents coexist, as in case study 1, it is difficult to predict mutagenicity by qualitative evaluation; however, we believe that our method can provide important information.
Using the in-house compound data set, we examined the cutoff value of ΔΔE in increments of 2.5 kcal/mol between −10 and +10 kcal/mol and found that accuracy and MCC were the highest when the cutoff value was set to +2.5 kcal/mol (Table 3). The cutoff value may change depending on the validation compounds used, and further investigation using more compounds is necessary.
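A cutoff scan of this kind can be sketched as follows. It assumes that a compound is called mutagenic when its ΔΔE is below the cutoff; the function names are hypothetical, and the ΔΔE values and Ames labels in the usage lines are made up for illustration and are not the in-house data.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0


def scan_cutoffs(dde_values, is_mutagenic, start=-10.0, stop=10.0, step=2.5):
    """Return (cutoff, accuracy, MCC) for each cutoff from start to stop."""
    results = []
    n_steps = int(round((stop - start) / step))
    for i in range(n_steps + 1):
        cutoff = start + i * step
        tp = tn = fp = fn = 0
        for dde, truth in zip(dde_values, is_mutagenic):
            predicted = dde < cutoff  # assumed rule: below cutoff = mutagenic
            if predicted and truth:
                tp += 1
            elif predicted and not truth:
                fp += 1
            elif truth:
                fn += 1
            else:
                tn += 1
        accuracy = (tp + tn) / len(dde_values)
        results.append((cutoff, accuracy, mcc(tp, tn, fp, fn)))
    return results


# Made-up ΔΔE values (kcal/mol) and Ames labels, for illustration only.
toy_dde = [-12.0, -6.0, -1.0, 1.0, 2.0, 4.0, 8.0, 15.0]
toy_ames = [True, True, True, True, True, False, False, False]
scan = scan_cutoffs(toy_dde, toy_ames)
best_cutoff = max(scan, key=lambda row: row[1])[0]
```

Because the best cutoff depends on the compounds used for the scan, the selected value should be re-checked whenever the validation set changes.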
In this paper, we have shown the usefulness of the model focusing on aromatic amines. In the mutagenic mechanism of aromatic nitro compounds, the nitro group is reduced by nitroreductase and nitrenium ions are ultimately produced [38]. Although the application of our method to aromatic nitro compounds has not been investigated, it may therefore be applicable to them as well. This local QSAR model is different from the two complementary QSARs (expert rule-based and statistics-based) recommended in ICH M7. ICH M7 recommends that analyses using these QSAR models should be performed first, and expert review should be considered when necessary. We hope that this local QSAR model is useful in making decisions during expert reviews.
Further validation using more known and undisclosed aromatic amines may be necessary to clarify the applicability of our method. To facilitate its standardization, the SVL script for the method is provided free of charge to MOE users by MOLSIS Inc. We hope that further refinement of this method will contribute to the standardization of expert reviews on the prediction of mutagenicity of aromatic amines under the ICH M7 guideline.

Conclusions
A shareable MOE SVL script was developed for predicting the mutagenicity of primary aromatic amines based on nitrenium ion stability. This local QSAR tool will be useful as a quantitative support tool to explain the substituent effects on the mutagenicity of primary aromatic amines. By further refinement, this tool can support the ICH M7 expert review with quantitative indicators.