Genetic spectrum and clinical features in a cohort of Chinese patients with autosomal recessive cerebellar ataxias

Background: Although many causative genes have been uncovered in recent years, genetic diagnosis is still missing for approximately 50% of autosomal recessive cerebellar ataxia (ARCA) patients. Few studies have been performed to determine the genetic spectrum and clinical profile of ARCA patients in the Chinese population.
Methods: Fifty-four Chinese index patients with unexplained autosomal recessive or sporadic ataxia were investigated by whole-exome sequencing (WES) and copy number variation (CNV) calling with ExomeDepth. Likely causal CNV predictions were validated by CNVseq.
Results: Thirty-eight mutations, including 29 novel ones, were identified in 25 of the 54 patients, giving a positive molecular diagnostic rate of 46.3%. Ten different genes were involved, of which the four most common were SACS, SYNE1, ADCK3 and SETX, accounting for 76.0% (19/25) of the positive cases. The de novo microdeletion in SACS was reported for the first time in China and the uniparental disomy of ADCK3 was reported for the first time worldwide. Clinical features of the patients carrying SACS, SYNE1 and ADCK3 mutations were summarized.
Conclusions: Our results expand the genetic spectrum and clinical profiles of ARCA patients, demonstrate the high efficiency and reliability of WES combined with CNV analysis in the diagnosis of suspected ARCA, and emphasize the importance of complete bioinformatics analysis of WES data for accurate diagnosis.
Supplementary Information: The online version contains supplementary material available at 10.1186/s40035-021-00264-z.

long thought to be largely confined to a specific group of French-Canadian populations, are relatively frequent ARCAs distributed around the world [2,7]. Despite the discovery of many disease-causing genes in recent years, the genetic cause of ARCAs remains elusive in more than 50% of affected individuals [2,8,9]. The advent of whole-exome sequencing (WES) technology has enabled efficient detection of single-nucleotide variants (SNVs) and small indels [10], and new WES-based methods, such as ExomeDepth, have enabled detection of structural variations such as copy number variations (CNVs) in many hereditary diseases [11]. Moreover, owing to the low incidence, few studies have reported the genetic or clinical characteristics of ARCA patients in the Chinese population.
In this study, we performed WES and CNV calling for 54 unrelated Chinese patients with autosomal recessive or sporadic hereditary cerebellar ataxia (HCA) to assess the possibility and prevalence of mutations.

Subjects
All patients were consecutively enrolled from Huashan Hospital of Fudan University and the Second Affiliated Hospital of Zhejiang University School of Medicine between August 9, 2008 and May 5, 2021. The patients were evaluated and diagnosed with HCA based on Harding's criteria [12] by at least two senior neurologists. Informed consent was signed by all participants or their guardians. This study was approved by the Ethics Committees of the above two hospitals.
Fifty-four patients meeting the following inclusion criteria were enrolled in this study: (1) having progressive cerebellar ataxia; (2) being negative for molecular analysis of mitochondrial ataxia or 10 common subtypes of spinocerebellar ataxia (SCA) (including SCA1, 2, 3, 6, 7, 8, 10, 12, 17 and DRPLA) resulting from dynamic mutations or Huntington's disease; (3) exclusion of other identified etiologies, e.g., multiple system atrophy, Niemann-Pick disease, Parkinson's disease, Wilson's disease, multiple sclerosis, viral infection, alcohol or drug intoxication, or paraneoplastic syndrome; (4) autosomal recessive inheritance with affected siblings and/or consanguineous union of the parents (n = 12); and (5) sporadic cases with onset before the age of 40 years [5,13] (n = 42). In addition, 1000 unrelated Chinese individuals without a history of ataxia were included as controls.

Genomic DNA extraction and whole-exome sequencing
Genomic DNA was extracted from peripheral blood samples of all patients by using the QIAamp Blood Genome Extraction Kit (Qiagen, Germany) following a standard protocol. Whole-exome Illumina sequencing of DNA was performed according to a detailed protocol described in our previous study [14]. All variants were annotated by ANNOVAR. Two public databases, the 1000 Genomes Project (http://browser.1000genomes.org) and the Exome Aggregation Consortium (http://exac.broadinstitute.org/), and our in-house WES database containing 1000 Chinese control individuals, were used to check the frequency of the variants in the general population. Three software programs, SIFT (http://sift.jcvi.org/), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/) and Mutation Taster (http://www.mutationtaster.org/), were used to predict the possible deleterious effects of mutations. Moreover, these variants were compared with the Human Gene Mutation Database (HGMD, Professional 2021.1, http://www.hgmd.cf.ac.uk/) to determine whether they were known or novel. Finally, the interpretation and classification of variants was performed based on the American College of Medical Genetics and Genomics (ACMG) standards [15].
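The frequency check and in silico prediction steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the allele-frequency cut-off, the number of tools required to agree, the `Variant` record and the example values are all hypothetical, since the text does not state the thresholds used, and the ACMG classification step is not reproduced.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration only; the study does not state its thresholds.
MAX_POPULATION_AF = 0.001   # assumed upper bound on population allele frequency
MIN_DAMAGING_CALLS = 2      # assumed number of prediction tools that must agree


@dataclass
class Variant:
    label: str
    af_1000g: float                 # allele frequency in the 1000 Genomes Project
    af_exac: float                  # allele frequency in ExAC
    af_inhouse: float               # allele frequency in the in-house control exomes
    sift_damaging: bool
    polyphen2_damaging: bool
    mutationtaster_damaging: bool
    in_hgmd: bool                   # True if the variant is already listed in HGMD


def is_rare(v: Variant) -> bool:
    """A variant counts as rare only if it is below the cut-off in every source."""
    return max(v.af_1000g, v.af_exac, v.af_inhouse) < MAX_POPULATION_AF


def damaging_calls(v: Variant) -> int:
    """Number of in silico tools that predict a deleterious effect."""
    return sum([v.sift_damaging, v.polyphen2_damaging, v.mutationtaster_damaging])


def triage(v: Variant) -> str:
    """Apply the frequency filter, then the prediction filter, then the HGMD lookup."""
    if not is_rare(v):
        return "filtered: too common"
    if damaging_calls(v) < MIN_DAMAGING_CALLS:
        return "filtered: weak in silico support"
    return "candidate (known)" if v.in_hgmd else "candidate (novel)"


# Made-up example records, not variants from the study.
examples = [
    Variant("example_1", 0.0, 0.0, 0.0, True, True, True, False),
    Variant("example_2", 0.02, 0.015, 0.03, True, True, False, False),
    Variant("example_3", 0.0, 0.0001, 0.0, True, False, False, True),
]
for v in examples:
    print(v.label, "->", triage(v))
```

The prediction filter in this sketch only makes sense for missense variants; frameshift, nonsense and splicing variants would normally be assessed by their predicted effect on the transcript rather than by these scores.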

CNV analysis
CNVs were called from the read depth of WES data using the ExomeDepth algorithm according to the developers' guidelines [16]. For these analyses, each test exome was compared with a set of matched, aggregate reference samples. CNV calls were annotated using AnnotSV [17]. Candidate CNVs were prioritized by exon number, Bayes factor, minor allele frequency and the ratio of observed to expected number of reads. Candidate CNVs were then validated by CNVseq, a low-coverage WGS strategy comprising DNA extraction, fragmentation, library construction and sequencing on an Illumina HiSeq 2500 (Illumina, San Diego, CA). The databases ISCA, DGV, Decipher, OMIM, ClinVar and ClinGen were used to analyze the CNVs. Comprehensive assessments of CNV hazard levels were undertaken based on a frequency database and annotation information according to the ACMG standards and guidelines [18].
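The ratio of observed to expected reads used for prioritization can be illustrated with a simplified sketch. This is not ExomeDepth itself, which fits a statistical model to the read counts rather than applying fixed cut-offs; the function names, the 0.7 and 1.3 thresholds and the read counts below are assumptions made for illustration.

```python
def read_ratios(test_counts, reference_counts):
    """Observed/expected read ratio per exon.

    The expected count for an exon is the reference's share of reads in that
    exon, scaled to the total number of reads in the test sample.
    """
    test_total = sum(test_counts)
    ref_total = sum(reference_counts)
    ratios = []
    for test, ref in zip(test_counts, reference_counts):
        expected = ref / ref_total * test_total
        ratios.append(test / expected if expected > 0 else float("nan"))
    return ratios


def call_state(ratio, loss=0.7, gain=1.3):
    """Naive call from the ratio; both thresholds are assumed values."""
    if ratio != ratio:  # NaN: the exon has no reference coverage
        return "no call"
    if ratio < loss:
        return "deletion"
    if ratio > gain:
        return "duplication"
    return "normal"


# Made-up read counts for ten exons: exons 4 and 5 have half the usual depth.
reference = [200] * 10
test = [100, 100, 100, 50, 50, 100, 100, 100, 100, 100]

ratios = read_ratios(test, reference)
states = [call_state(r) for r in ratios]
print([round(r, 2) for r in ratios])
print(states)
```

A ratio near 0.5 over consecutive exons is what a heterozygous deletion would be expected to produce, which is why adjacent exons with a consistent shift are more convincing than a single outlier.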

Affymetrix CytoScan® Dx assay
The Affymetrix CytoScan® Dx Assay was used to identify uniparental disomy (UPD). The assay uses a high-density combined SNP and comparative genomic hybridization array platform that assesses approximately 2,696,550 markers, including approximately 750,000 SNP markers. Whole-genome screening and analysis of chromosomal rearrangements were performed according to the manufacturer's recommendations [19].

Twenty-nine novel variants were identified in 25 unrelated ARCA patients
A total of 38 variants (Table 1), including 29 novel and nine known variants in 10 genes, were identified in 25 out of 54 ARCA patients (chromatograms for novel variants except chr13:23490196-24866656del are shown in Fig. 1a). All novel variants were absent from or present at extremely low frequency in both public databases and our in-house WES database. The pathogenicity of variants was consistently predicted by different in silico prediction programs. All of the novel missense variants affected residues that are highly conserved among animal species (Fig. 1b). According to the ACMG standard, 17 out of the 29 novel variants were classified as pathogenic variants, 10 as likely pathogenic variants, and the remaining two (ADCK3: p.R271H, SETX: p.Y2455C) as variants of uncertain significance, whose pathogenicity needs to be confirmed by further functional studies. Six out of the nine known variants (Additional file 1: Fig. S1a) were identified as pathogenic variants and three as likely pathogenic variants.

Identification of a de novo microdeletion and clinical features of patients with SACS mutations
After WES analysis, a novel homozygous mutation (p.K3646fs) was identified in case 1, who had no family history (Table 2). However, on family segregation analysis, heterozygous p.K3646fs was confirmed only in his mother and brother, not in his father. The parent-child relationship was established by parenthood analysis (Fig. 2a). Therefore, CNV analysis with ExomeDepth was conducted in the proband, and a large deletion was detected on chromosome 13. Trio copy number variation sequencing (CNVseq) then identified a 1.38 Mb deletion (chr13:23490196-24866656del) in the proband, whereas the corresponding chromosomal region in the parents was normal, indicating a de novo large deletion (Fig. 2b). The chr13:23490196-24866656 region includes the entire SACS gene and 5 other genes (Fig. 2c). Thus, a total of nine mutations including eight novel ones in SACS were identified in six patients (Table 2).
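As a quick check of the reported deletion size, the coordinates can be converted to a length. The sketch below assumes that both endpoints are inclusive, which the text does not state; the helper names are illustrative.

```python
def parse_region(region):
    """Split a string such as 'chr13:23490196-24866656' into (chromosome, start, end)."""
    chromosome, span = region.split(":")
    start, end = (int(position) for position in span.split("-"))
    return chromosome, start, end


def region_length_bp(region, inclusive=True):
    """Length of the region in base pairs; inclusive endpoints are an assumption."""
    _, start, end = parse_region(region)
    return end - start + (1 if inclusive else 0)


length = region_length_bp("chr13:23490196-24866656")
print(length)                   # 1376461
print(round(length / 1e6, 2))   # 1.38
```

Either endpoint convention rounds to the 1.38 Mb given in the text.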
Among the six patients, the mean age at disease onset was 9.5 years (range 1-39 years); all presented with gait disturbance at onset, accompanied by sensorimotor neuropathy. Weakness of limbs was present in 4 cases, skeletal abnormality in 5 cases (pes cavus in 4 cases [Fig. 3a], 1 of which was accompanied by flexion deformity of fingers [Fig. 3b], and dental abnormalities in 1 case [Fig. 3c]), spasticity in 3 cases, hearing loss in 2 cases, and mental retardation in 1 case. Radiological evaluation of the 3-year-old patient (case 1) was normal, whereas the other 5 cases showed cerebellar atrophy, thinning of the corpus callosum, bulky pons, bilateral pontine linear hypointense lesions and hyperintensities around the thalamus (Fig. 3d-f), and one had craniocerebral dysplasia (case 4). A characteristic retinal finding in case 2 and case 4 was the presence of yellow streaks of hypermyelinated fibers radiating from the edges of the optic disc and retinal nerve fiber hypertrophy, as demonstrated on ocular coherence tomography (Fig. 3g-k).

Characteristics of patients with SYNE1 mutations
Among the nine novel SYNE1 pathogenic variants found in five cases, four were frameshift, three were nonsense and two were splicing variants. The onset age was before 25 years in four patients and 53 years (late onset) in one patient. All five cases had gait disturbance at onset, and three of them had dysarthria and horizontal nystagmus. Case 7 had the earliest onset and the most complex phenotype, including psychiatric symptoms, external ophthalmoplegia and myoclonic jerks (Additional file 2: Video S1). Case 8, with a homozygous mutation (p.E3053fs), appeared to have pure cerebellar ataxia. Both case 9 and case 11 had sensorimotor neuropathy, and case 9 also had tremor, dizziness and pes cavus. Case 10 had a 15-year history of sensorineural hearing loss and mild ataxia. She also carried two pathogenic mutations in PTPRQ (NM_001145026.1), c.6149-3T>G and c.1898dupA (p.E633fs) (Additional file 1: Fig. S1b). On pedigree verification, her brother, who had impaired hearing but no ataxia, was found to carry both PTPRQ mutations but only one heterozygous mutation in SYNE1, and the mutations in SYNE1 and PTPRQ were inherited independently from the two parents, who had normal phenotypes (Additional file 1: Fig. S1c). No magnetic resonance imaging report was available for case 8; the remaining four probands showed varying degrees of cerebellar atrophy.

Identification of a rare UPD and clinical features of patients with ADCK3 mutations
After WES analysis, six ADCK3 mutations including three novel ones (p.R271H, p.L320fs and p.R598H) were identified in four males with no family history of disease and one female with an affected sister. A homozygous variant (p.R410X) within ADCK3 was identified in case 12 (Table 2). However, heterozygous p.R410X was confirmed only in his mother and not in his father, and parenthood analysis confirmed the parent-child relationship (Fig. 4a). The CNV analysis of the index patient was also normal. Thus, UPD was considered. The Affymetrix CytoScan® Dx Assay detected maternal UPD of chromosomes 1p and 1q in the patient (Fig. 4b), which included the whole ADCK3 gene (Fig. 4c).
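The reasoning applied to case 1 (SACS) and case 12 (ADCK3), where an apparently homozygous variant was found in only one parent, can be written as a small decision function. This is a sketch of the logic described in the text, not a validated diagnostic algorithm; the function name, arguments and return strings are illustrative.

```python
def explain_apparent_homozygosity(mother_carrier, father_carrier,
                                  parentage_confirmed, deletion_in_proband,
                                  deletion_in_parent=False):
    """Suggest why a proband looks homozygous, following the workflow in the text."""
    if mother_carrier and father_carrier:
        return "biparental inheritance"
    if not (mother_carrier or father_carrier):
        return "re-check genotypes: neither parent carries the variant"
    if not parentage_confirmed:
        return "confirm parentage before further testing"
    if deletion_in_proband:
        if deletion_in_parent:
            return "inherited deletion on the other allele (hemizygous variant)"
        return "de novo deletion on the other allele (hemizygous variant)"
    return "suspect uniparental disomy; confirm with a SNP array"


# Case 1: mother carries p.K3646fs, father does not, deletion found only in the proband.
print(explain_apparent_homozygosity(True, False, True, True))
# Case 12: mother carries p.R410X, father does not, CNV analysis normal.
print(explain_apparent_homozygosity(True, False, True, False))
```

The order of the checks mirrors the order of the investigations in the two cases: segregation first, then parenthood analysis, then CNV calling, and the SNP array only when no deletion explains the finding.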
In-house data: n = 2000. The impact of non-synonymous protein-coding region variants was determined using prediction software including SIFT, PolyPhen-2 and Mutation Taste

Discussion
To date, few studies have been conducted to investigate all causative genes and the clinical features of ARCA in the Chinese population. In this study, the prevalence of ARCA was systematically investigated in 54 unrelated autosomal recessive/sporadic ataxia patients by WES analysis and CNV calling, which is the largest cohort in China to date. Thirty-eight mutations, including 29 novel mutations in 10 genes related to ARCA, were identified in 25 unrelated patients. Among them, the de novo microdeletion in SACS was reported for the first time in the Chinese population and the UPD of ADCK3 was reported for the first time worldwide. In this study, the diagnostic yield was 46.3% (25 of 54 patients). This rate is at a relatively high level compared to those previously reported using exome sequencing [5,22-27]. Marelli et al. also performed a mini-exome and read-depth-based CNV analysis in 33 ataxic patients and identified pathogenic variants in 14 cases (42%) including CNV in 2 patients [28]. The differences in diagnostic rate were most likely due to the differences in source populations (ethnic and geographic origin), sample size, the inclusion criteria used and study methodology. One of the major highlights of the present study is the combination of WES and CNV analysis, and the results truly demonstrate that structural variation might not be extremely rare in ARCA. With the development of high throughput sequencing technology and bioinformatics algorithms, many established approaches for CNV calling in WES data have been available and easily usable for biologists and geneticists to test structural variations, such as ExomeDepth, CovCopCan, IonCopy, DeviCNV, and Cov'Cop [29]. Thus, we suggest that tools for detecting structural variations such as CNVs should be used routinely for NGS data analysis in order to increase the rate of positive diagnosis.
Fig. 2 A de novo large deletion in a patient with ARSACS. a The pedigree of case 1 shows segregation of p.K3646fs in SACS and chr13:23490196-24866656del. The analysis of the repeat numbers of 21 core short tandem repeat loci in the four participants showed that the probability of the patient being the alleged parents' biological son was 99.99%. Open symbol: unaffected; filled symbol: affected; square: male; circle: female; arrow: proband of the family. Symbol with "+/+" indicates patient. Symbols with "+/−" indicate mutation carrier. b CNVseq of the proband and his parents showed that chr13:23490196-24866656del was a de novo mutation, as indicated by the blue box. c Schematic diagram of the known genes in this deletion region (reference human Genome Build GRCh37, UCSC Assembly hg19)
Fig. 3 Clinical features of patients with ARSACS. a-c The special phenotypes of ARSACS: pes cavus (a), flexion deformity of fingers in case 4 (b) and abnormality of dentition in case 1 (c). d-f Classic brain magnetic resonance images in case 2. d Sagittal T2 sequence shows thinning of the corpus callosum (red arrow), superior vermian atrophy (yellow arrow), and bulky pons (blue arrow); e axial T2 shows bilateral pontine linear hypointense lesions (arrow); f axial T2 shows hyperintensities around the thalamus (arrows). g-k Typical retinal findings in case 4: Fundus photographs of the right (g) and left eyes (j) show yellow streaks of hypermyelinated fibers radiating from the edges of the optic disc; ocular coherence tomography imaging of the right (h) and left eyes (i) show thickened retinal nerve fiber layer (RNFL) (the yellow areas); k statistical graph of OCT showing thickening of the RNFL. The black line indicates oculus dexter (OD), that is, the right eye; the dotted line indicates oculus sinister (OS), the left eye. The green band represents the 5%-95% range of the normative data. (Quadrants: TEMP temporal, SUP superior, NAS nasal, INF inferior)
In this study, 12 autosomal recessive ataxia and 42 sporadic ataxia families were included. Mutations in ARCA causative genes were identified in eight autosomal recessive ataxia and 17 sporadic ataxia families. Combined with our previous studies, the number of ARCA families with a definite diagnosis in our center was 27. Further, we demonstrate that ARSACS (gene: SACS; n = 6, 22.2%), SCAR8 (gene: SYNE1; n = 5, 18.5%), SCAR9 (gene: ADCK3; n = 5, 18.5%) and AOA2 (gene: SETX; n = 4, 14.8%) are the most common recessive ataxia subtypes in the Chinese population. In a previously reported Chinese ARCA cohort, ARCA-causing genes were identified in 19 out of 26 probands, including AOA2 (n = 4, 15.4%), Niemann-Pick disease (n = 3, 11.5%), one ARSACS and one SCAR8 [27]. SACS and SYNE1 mutations have been observed mainly in Quebec, Canada, where ARSACS and SCAR8 are the second and third most common hereditary ataxias, respectively [7]. FRDA has been reported as the most frequent ARCA in Caucasians but is much rarer in Chinese patients [30]; thus, testing for the dynamic mutation of FRDA was not undertaken in our study. Although the aetiology of ARCA in Chinese patients differs from the patterns reported in Caucasians, presumably because of the different genetic backgrounds and ethnicities, there are still some similarities that may contribute to a better understanding of the epidemiology and mechanism of ARCA.
In total, 14 ARSACS patients from 10 families have been reported in China (Additional file 4: Table S2), including the 6 probands in this study, confirming that nonsense or frameshift mutations in SACS are the most common genetic cause in Chinese patients. In our study, the majority of patients who had at least one truncating variant had a typical ARSACS clinical presentation with childhood onset of symptoms. However, one patient who harbored a homozygous non-frameshift deletion variant (p.3758_3759del) exhibited an atypical presentation, with onset in adulthood and an absence of spasticity or pyramidal signs, whereas the same variant has been reported in two heterozygous patients, both with early onset (1 and 13 years old) and typical triad symptoms [31,32]. Thus, whether truncating variants in SACS are linked to a typical clinical manifestation of ARSACS requires further exploration. Although this is only the second de novo large deletion containing SACS reported worldwide [33], CNVs in SACS have already been reported in many populations, including Belgian, French, Italian, Canadian, German and Chinese [33-39]. Therefore, the presence of a CNV must be considered if no mutation or only one heterozygous mutation has been identified in patients whose clinical presentation or auxiliary examinations suggest an ARSACS phenotype.
Fig. 4 A maternal uniparental disomy in a patient with SCAR9. a The pedigree of case 12 shows the segregation of p.R410X in ADCK3. The analysis of the repeat numbers of 21 core short tandem repeat loci in the three participants showed that the probability of the patient being the alleged parents' biological son was 99.99%. b Uniparental disomy of chromosomes 1p and 1q detected by Affymetrix CytoScan® Dx Assay in the patient. c Schematic diagram of ADCK3 in this UPD region (1pterp36.11 and 1q42.12qter; reference human Genome Build GRCh37, UCSC Assembly hg19)
Defects in SYNE1 are associated with adult-onset, slowly progressive, relatively pure cerebellar ataxia with only a few extracerebellar symptoms (SCAR8), and almost all reported variants that cause this phenotype are protein truncations [7]. Previous studies showed that SYNE1 ataxia accounted for 5.3% (23/434), 6% (7/116), and 10.26% (4/39) of recessive and sporadic ataxia patients in two European combined cohorts and one Brazilian cohort [7,40,41]. In the present study, 9.3% (5/54) of ataxia patients had biallelic truncating variants in SYNE1, demonstrating that SCAR8 is also a common subtype of recessive ataxia in China. Two SCAR8 patients (cases 8 and 10) both had symptoms of pure cerebellar ataxia, while the hearing loss of case 10 was caused by mutations in PTPRQ. Moreover, the remaining three patients exhibited variable additional extracerebellar neurological symptoms (peripheral polyneuropathy, mental retardation, dizziness, pes cavus, external ophthalmoplegia, myoclonic jerks) and non-neurologic dysfunctions (psychiatric symptoms). The reported nine SCAR8 Chinese patients included three presenting with pure cerebellar ataxia and six presenting with variable ataxia syndrome [27,42,43]. Thus, in the Chinese population, pure cerebellar ataxia only accounted for 35.7% (5/14) of SCAR8 cases, while the other 64.3% (9/14) of patients showed complex ataxia phenotypes with a wide range of noncerebellar abnormalities. This further supports the concept that SYNE1 ataxia is a multisystemic neurodegenerative disease, as proposed by Synofzik et al. [7].
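The pooled proportions quoted above follow directly from the counts given in the text: two pure and three complex SCAR8 cases in this study, plus three pure and six complex cases among the nine previously reported Chinese patients. A short check, with illustrative variable names:

```python
def percent(numerator, denominator, digits=1):
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


# Counts taken from the text.
this_study = {"pure": 2, "complex": 3}   # cases 8 and 10 are pure; three are complex
previous = {"pure": 3, "complex": 6}     # nine previously reported Chinese patients

pooled = {phenotype: this_study[phenotype] + previous[phenotype] for phenotype in this_study}
total = sum(pooled.values())

print(total)                              # 14
print(percent(pooled["pure"], total))     # 35.7
print(percent(pooled["complex"], total))  # 64.3
print(percent(5, 54))                     # 9.3, SYNE1 cases in this cohort
```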
A total of 65 pathogenic mutations in ADCK3 have been reported around the world (HGMD, Professional 2021.1), and SCAR9 has also been reported as a common subtype of recessive ataxia [26]. However, cases of SCAR9 have rarely been reported in the Chinese population [44]. Here, we report for the first time that SCAR9 also has a high frequency in China. The homozygous p.S616fs in ADCK3 has been reported in two siblings from a consanguineous family of Pakistani origin, and both siblings presented with cerebellar ataxia, myoclonus, tremor and dysarthria at the ages of 10 and 14, respectively [45]. In our study, however, two patients harboring p.S616fs in heterozygous or homozygous form both presented with prominent tremor and mild ataxia symptoms, but with onset in adolescence and adulthood, respectively. Moreover, the homozygous patient presented with additional dysphagia and peripheral neuropathy, and his sister was found to carry the mutation but was still asymptomatic at the age of 24. Thus, the clinical presentation of SCAR9 may be highly variable, even in patients with the same mutation from one family. Our study also showed that supplementation with CoQ10 was helpful for SCAR9 patients, although the reported efficacy of this therapy varies across studies [45-47]. The UPD of ADCK3 identified in our study is the first reported worldwide, which not only enriches the genotypic spectrum of SCAR9, but also emphasizes the importance of a detailed analysis of family segregation.
The absence of a diagnosis for 29 patients in the cohort may be explained by the following reasons. First, it is difficult for exome capture to fully cover all coding regions of the genome, especially GC-rich regions. Second, large genomic rearrangements and trinucleotide expansions cannot be reliably detected from exome-capture data; although CNV-detecting tools based on the read depth of NGS data have been developed, such as ExomeDepth used in this study, they cannot easily detect inversions or translocations [34]. Third, some causal variants are likely to lie outside the coding regions and adjacent splice sites [48]. Finally, insights into the functional consequences of the variants are missing, as some synonymous mutations may also be causative [49]. Some of these issues can be addressed by whole-genome sequencing, which is, however, expensive and requires complex bioinformatics analyses. In addition, mutations in as yet unknown ARCA genes may play a key role in these unexplained cases.

Conclusions
In summary, this is so far the largest WES analysis and CNV calling study to explore the genetic background and describe the clinical characteristics of ARCA in a Chinese population. We identified 38 mutations including 29 novel ones in 25 unrelated Chinese ARCA patients. We also report the UPD of ADCK3 for the first time worldwide and the de novo microdeletion of SACS for the first time in China. Our results expand the genetic spectrum and clinical profiles of ARCA patients, demonstrate the high efficiency and reliability of WES combined with CNV analysis in diagnosing suspected ARCA, and emphasize the importance of complete bioinformatics analysis of WES data in making an accurate diagnosis. Further functional studies will help to determine the pathogenicity of