Additional evidence for the role of chromosomal imbalances and SOX8, ZNRF3 and HHAT gene variants in early human testis development

Background
 Forty-six ,XY Differences/Disorders of Sex Development (DSD) are characterized by a broad phenotypic spectrum ranging from typical female to male with undervirilized external genitalia, or more rarely testicular regression with a typical male phenotype. Despite progress in the genetic diagnosis of DSD, most 46,XY DSD cases remain idiopathic. Methods To determine the genetic causes of 46,XY DSD, we studied 165 patients of Tunisian ancestry, who presented a wide range of DSD phenotypes. Karyotyping, candidate gene sequencing, and whole-exome sequencing (WES) were performed. Results Cytogenetic abnormalities, including a high frequency of sex chromosomal anomalies (85.4%), explained the phenotype in 30.9% (51/165) of the cohort. Sanger sequencing of candidate genes identified a novel pathogenic variant in the SRY gene in a patient with 46,XY gonadal dysgenesis. An exome screen of a sub-group of 44 patients with 46,XY DSD revealed pathogenic or likely pathogenic variants in 38.6% (17/44) of patients. Conclusion Rare or novel pathogenic variants were identified in the AR, SRD5A2, ZNRF3, SOX8, SOX9 and HHAT genes. Overall our data indicate a genetic diagnosis rate of 41.2% (68/165) in the group of 46,XY DSD.


Background
Differences/Disorders of Sex Development (DSDs) are defined as congenital conditions with a discrepancy between chromosomal, gonadal, and phenotypic sex [1]. They represent a major clinical concern that is most often present in newborns or adolescents [2]. The prevalence of DSD is often underestimated since the diagnosis can be relatively late, at puberty or during adulthood and, in some countries, sexual issues are still sensitive, resulting in a reluctance to seek clinical counselling [3]. This may explain why in Saudi Arabia and Egypt, the incidence of ambiguous genitalia is estimated to be 1:2,500 and 1: 3,000 of live births, respectively, whilst in European countries it is estimated at 1: 4,500-1: 5,500 of live births [4][5][6][7].The data could also reflect the high rate of consanguinity, especially in developing countries, where autosomal recessive forms of DSD are more prevalent [8]. Population isolates may also contribute to the presence of rare or novel variants with a limited geographic range [8].
Forty-six ,XY DSD can be due to chromosome abnormalities or genetic variants in the genes involved in the development or function of the male gonad as well as anomalies of downstream target tissues [9]. In most studies, the genetic cause is established in less than 50% of 46,XY DSD cases [1,9,10]. At a molecular level pathogenic variants in the AR, NR5A1, SRD5A2, ZFPM2, HSD17B3 and DHH genes are the most frequent causes of 46,XY DSD [9,10]. The aim of this study was to define the genetic etiology in a large cohort of 46,XY DSD patients from a North African population and compare these data to those observed in other populations. The cytogenetic analysis and molecular gene approaches resulted in a combined diagnosis yield of 41.2% (68/165) for this DSD subgroup. Cytogenetic analysis detected autosomal or sex chromosome anomalies in 30.9% of all cases, whereas WES identified rare or novel variants in the AR, SRD5A2, ZNRF3, SOX8, SOX9 and HHAT genes (17/44 cases; 38.6%). These results emphasize the usefulness of both cytogenetic approaches as well as exome sequencing to make an accurate genetic diagnosis for a better genetic counseling and knowledge-based management of this group of patients.

Cohort and study design
A total of 165 patients with DSD were referred for genetic consultation in the department of Cytogenetic, Molecular biology, and Biology of Human Reproduction, Teaching hospital Farhat Hached, Sousse, Tunisia over a period of 3 years (2018-2020). The local Ethics Board of the University Teaching Hospital Farhat Hached approved the present study (IRB00008931) and written consents were taken from adult probands or from the parents when the patient was under 18 years. The patients presented with a range of clinical DSD profiles and their ages ranged from birth to 35 years. They underwent a complete clinical examinations, including genital examination, family history and examination for the presence of somatic abnormalities. Imaging examination and hormonal evaluation were also carried out according to each case. Patients with suspected or confirmed congenital adrenal hyperplasia (CAH) were excluded from this study. All patients are from Tunisian ancestry.

Genetic analysis Cytogenetic studies
Reverse Heat Giemsa (RHG) banded karyotype was performed on metaphase chromosome preparations obtained from peripheral blood lymphocytes of both patients and parents according to standard protocol (450-550 band level). A minimum of 20 R-banded metaphase chromosomes were analyzed using Cytovision ® Karyotyping software version 4.0. Karyotypes were classified according to the International System of Human Cytogenetic Nomenclature (ISCN 2020) [11]. Fluorescent in situ Hybridization (FISH) was carried out on metaphase chromosomes of the patients according to the standard protocol, using commercial probes. Array Comparative genomic hybridization (aCGH) 4 × 44 K micro-arrays was performed using the Agilent platform according to the manufacturer's instructions (Feature Extraction 9.1, CGH Analytics 4.5, Santa Clara, California, United States). An abnormal ratio greater than + 0.58 or lower than − 0.75 was considered as an alteration. An in silico analysis of the unbalanced regions was executed using UCSC Genome Browser (https:// genome. ucsc. edu/), the Database of Chromosome Imbalance and Phenotype in Humans using Ensemble Resources (DECI-PHER: https:// decip her. sanger. ac. uk/), the Database of Genomic Variants (DGV: http:// dgv. tcag. ca/ dgv/ app/ home) and the Online Mendelian Inheritance in Man database (OMIM: https:// omim. org/).

Sanger sequencing
Genomic DNA was extracted from the peripheral blood of the patient using the FlexiGene DNA Kit (Qiagen, Hilden, Germany). Direct Sanger sequencing was performed using the Big Dye Terminator V3.1 Cycle Sequencing, on the ABI 3730XL sequencer (Applied Biosystems, Foster City, CA, USA). Sequencing data were analyzed by SeqScape 2.0 software (Applied Biosystems).

Whole exome sequencing
The WES approach was performed on DNA from 44 XY individuals who had a complete clinical investigation including examination of genitalia, hormonal screens and, where possible, gonad histology. All of these patients presented with a broad spectrum of 46,XY DSD phenotypes for which the underlying cause is unknown. Exonic and adjacent intronic sequences were enriched from genomic DNA using Agilent SureSelect Human All Exon V4, and paired-end sequencing was done with the TruSeq v3 chemistry on Illumina HiSeq2000 platform. Based on the manufacturer's proprietary software, reads were mapped using the Burrows-Wheeler Aligner. Single nucleotide variants (SNV) and small insertions or deletions (Indels) were generated with GATK 1.6 version. BAM files were also carried out using SAMtools version 0.1.18. GATK. Unified Genotyper software was used for calling single nucleotide polymorphism (SNP) and Indels variants for each patient.
The annotated VCF files were then formatted to be used as a Microsoft Excel spreadsheet software and a selection of variants according to well-defined criteria (degree of pathogenicity, type of variant, frequency of the variant in all populations, including sub population) was performed. Synonymous, intronic and non-coding RNA variants were removed. Missense, nonsense, insertion/deletion and splice-site variants that were homozygous with a Minor Allele Frequency (MAF) of > 0.01 were excluded and heterozygous variants with a MAF of > 0.001 according to the GnomAD database (https:// gnomad. broad insti tute. org/) were also excluded.
The possible impact on protein structure and function was evaluated to determine the pathogenicity of the variants based on individual scores made by Sorting Intolerant from Tolerant (SIFT),Polymorphism phenotyping V2( PolyPhen2) and Rare Exome Variant Ensemble Learner (REVEL; [12]) tools.
The Clustal Omega tool (https:// www. ebi. ac. uk/ Tools/ msa/ clust alo/) was used to generate alignments between three or more protein sequences. The Hope tool was used to analyze the structural effects of a point variation in a protein sequence [13].
Clinical significance was established according to the 2015 American College of Medical Genetics and Genomics and Association for Molecular Pathology (ACMG) in order to establish a better genotype-phenotype correlation. [14]. Potentially pathogenic variants were verified by Sanger sequencing.
The WES cohort of 46,XY DSD consisted of 19 individuals raised as females and 25 raised as males. Within this group, 17 cases were syndromic and 27 cases nonsyndromic cases.

Results
The most common feature at consultation was atypical external genitalia (67 patients) or typical male external genitalia with azoospermia (42 patients). A total of 30 patients presented with other congenital anomalies including intellectual deficiency, dysmorphic features, heart defects growth delay and cerebral anomalies. Primary amenorrhea and delayed puberty were reported in 18 and 8 cases respectively (Table 1). In this cohort, the patients were classified into three groups: Sex chromosome DSD, autosomal chromosomal abnormalities and 46,XY DSD ( Table 1). 66% of the studied patients (110/165) were diagnosed as having 46,XY DSD of whom 23% were raised as females.

Cytogenetic results
The proportions of different categories of DSD are shown in Table 1 and the distribution of patients with sex chromosome DSD in relation to their karyotype is illustrated in Fig. 1. Sex chromosome anomalies were detected in 48/165 patients (29.1%) and autosomal chromosome abnormalities were detected in three individuals (1.8%). Klinefelter syndrome (KS) was the most prevalent chromosome sex abnormality in (87.2%) and genetic cause of azoospermia (83.3%) in males. aCGH was performed on twenty patients based on their clinical presentation, suggesting a contiguous gene syndrome or an a priori assumption of the involvement of rearrangements affecting known gonadal genes or their regulatory sequences with various other extragonadal malformations. In 8 patients, anomalies were observed ( Table 2).
These included five cases with intra-chromosomal deletions, two cases of intra-chromosomal duplications, and one case with an inversion duplication/deletion (invdupdel) chromosome imbalance ( Table 2). Of these chromosomal anomalies, genes known to cause 46,XY DSD were identified for 4 patients (Table 2) including DMRT1, GATA4 and NR0B1. FISH analysis confirmed the heterozygous deletion of the GATA4 gene in patient 2 and a duplication of the NR0B1 gene in patient 3. The patient 1 presented the Wolf-Hirschhorn syndrome (WHS) [OMIM#194190]. In addition to the typical WHS phenotype, he presented a hypospadias, micropenis and cryptorchidism. The 4p16.3 deletion presumably results in haploinsufficiency of the MSX1 gene [OMIM#142983] whose absence might be indirectly responsible for the hypospadias phenotype as this gene contributes to the spatiotemporal regulation of GnRH transcription during development [15]. In three patients (patients 6-8), there was no obvious candidate gene located within the chromosomal anomaly. Clinical details and cytogenetic results are summarized in Table 2.
The most common genetic diagnosis was variants in the androgen receptor (26%, 7/27).
A de novo pathogenic variant (p.S426*) in the AR gene was observed in two sisters who presented complete androgen insensitivity syndrome (CAIS). Two other affected girls with CAIS from unrelated families (DSD3 and DSD4) shared a pathogenic variant (p.G744E), suggesting a possible founder effect. We identified novel or rare likely pathogenic variants in the ZNRF3 (DSD 11), HHAT (DSD 12), and SOX8 genes (DSD 30). A girl with 46,XY complete gonadal dysgenesis carried novel missense heterozygous ZNRF3 variant (p.I338M). According to SIFT (0.01), PP2 (0.519) and REVEL (0.461) scores, this variant is likely to be disease causing. Isoleucine 338 is a highly conserved residue within the long

Discussion
Sex chromosome as well as autosomal anomalies were present in 30.9% of the 46,XY DSD cohort, with the majority classified as 47,XXY Klinefelter's syndrome. This is similar to frequencies reported by Mazen et al., 2021 studying a North African cohort, but higher than those reported in other studies [17]. As suggested by Mazen et al., 2021, this rate may be due to a recruitment bias as the research center in Tunisia is a reference centre for cytogenetics. However, it indicates that a considerable proportion of 46,XY DSD cases is due to chromosome anomalies that can be detected during routine karyotyping. aCGH detected further 8 individuals with chromosomal anomalies, associated with 46,XY DSD in 4 patients. WES is considered the best method for identifying disease causing gene variants in DSD due to the complexity of the phenotypes [18]. Current data indicate that approximately more than half of patients with 46,XY DSD still lack a definite clinical diagnosis at the genetic level after WES [10,19]. In this North African cohort of DSD, the genetic cause was established in 41.2% (68/165) of the total cohort, with a genetic cause identified in 38.6% of patients following WES. Recent cohort studies, using WES rather than targeted NGS panels have given a diagnosis yield in 46,XY DSD cohorts of 43% and 51% respectively [10,20]. The lower yield of 38.6% reported here may reflect the proportion of undervirilised men in the cohort, a group that is difficult to reach a definitive clinical diagnosis or establish a genetic etiology [21,22]. However, similarly to other studies the most common genetic cause was hemizygous variants in the AR [23,24]. A total of 7 individuals, including two sisters, carried pathogenic variants in the AR. The G744E variant was observed in two unrelated patients, suggesting a possible founder effect for this variant.
A proportion of XY males carrying deletions of 8p23.1 that encompasses the GATA4 gene have hypospadias and bilateral cryptorchidism [25,26]. Here, a 46,XY female with atypical external genitalia (micropenis, small palpable right testis) carried a 4 Mb microdeletion in the 8p23.1 encompassing the GATA4 gene [27,28]. Pathogenic variants in GATA4 have been identified in 46,XY DSD with or without cardiac heart defect [27][28][29]. To our knowledge this is the first case with a 8p23 microdeletion in a patient with 46,XY DSD raised as female.
WES revealed several very rare causes of 46,XY DSD including the genes ZNRF3, SOX8 and HHAT. A novel heterozygous missense variant (p.I338M) in ZNRF3 was identified in a 46,XY female with complete gonadal dysgenesis (DSD11). ZNRF3 functions in testis-determination by inhibiting canonical pro-ovary WNT signaling pathway in XY gonads [30]. ZNRF3 does this by targeting Frizzled receptors for degradation by ubiquitination and increased membrane turnover [31]. A total of four rare or novel heterozygous variants (3 missense and one splice region) in ZNRF3 have been reported with both mild and severe 46,XY DSD [30]. All of these variants, including the p.I338M reported here, are located within the C-terminal intracellular domain portion of the protein [31], suggesting a possible genotype/phenotype correlation. SOX8 is an high mobility group (HMG)-box transcription factor, which is co-expressed with SOX9 and NR5A1/SF1 in testis-determination. SOX8 shows functional redundancy with SOX9 and may represses Foxl2 expression [32][33][34]. Heterozygous missense variants in SOX8 are associated with either male or female infertility. Although rearrangements at the SOX8 locus are associated with 46,XY gonadal dysgenesis, only a single pathogenic missense variant, located within the conserved HMG domain (p.E156D), has been demonstrated to cause 46,XY gonadal dysgenesis [35]. Here, a novel heterozygous missense variant p.T226P, located within transactivation (TA) domain, was carried by a 46,XY female with testicular regression syndrome. The p.T226 residue is conserved within the SOXE group of proteins, suggesting a functional role. The mode of inheritance of the ZNRF3 and SOX8 variants mutation is unknown, as the parents were unavailable for study. Hedgehog acyltransferase (HHAT) is an ER-resident multipass membrane protein consisting of 10 transmembrane domains and 2 re-entrant loops [36]. It is a member of the membrane bound-O-acyltransferase (MBOAT) family of enzymes that catalyze the attachment of specific fatty acids to secreted proteins [37]. Hhat −/− mice display severely impaired development of fetal Leydig cells, Sertoli cells and testis cords [16]. In humans, biallelic pathogenic variants in HHAT are very rare and associated with a wide spectrum of neurodevelopmental phenotype including microcephaly, cerebellar vermis hypoplasia, gonadal dysgenesis, seizures and thinning of corpus callosum [16,38,39]. Only four families have been described in the literature and the common features are microcephaly and gonadal dysgenesis. Here, we identified a novel homozygous missense variant (p.R312S) in the conserved MBOAT domain-2 of HHAT carried by a 46,XY female with somatic anomalies including hydrocephalus, agenesis of the corpus callosum, skeletal malformations and bilateral anophtalmia.

Conclusion
A combination of cytogenetics and exome sequencing can explain the genetic cause of 46,XY DSD in just over 40% of all cases. Exome sequencing is particularly useful in detecting very rare genetic causes of DSD in genes such as ZNRF3, SOX8 or HHAT that would otherwise have been difficult to determine using other approaches.