Selection of gRNA for Genomic Editing of the Bovine Leucosis Virus-Susceptible Alleles of Exon 2 of the BoLA-DRB3 Gene by CRISPR/Cas9

The wide spread of the cattle leucosis virus among the livestock of the Russian Federation, together with the lack of reliable methods for its treatment and prevention, has motivated the creation of leucosis-resistant animals by genetic engineering. To create pedigree cattle resistant to the leucosis virus, the CRISPR/Cas9 genome editing method was used. This work is devoted to the first and main stages of genome editing: the choice of the method for introducing the target modification; the choice of the repair pathway for the selected site; the selection of the DNA sequence best suited for the target mutation; the analysis of off-target sites; and the choice of the guide RNA (gRNA). As a result, using molecular genetic software tools, an allele susceptible to the virus was selected, off-target sites were identified and analyzed, and a gRNA was selected.


Introduction
Genetic engineering is among the most promising and rapidly developing branches of biological science and can outpace the traditional breeding process. The creation of new animal lines makes it possible to obtain animals with valuable breeding qualities that are resistant to various diseases at the genetic level.
As of March 2020, 67 constituent entities of the Russian Federation, including the Kemerovo region, are not free of cattle leucosis, as confirmed by official information from the Federal Service for Veterinary and Phytosanitary Surveillance (https://cerberus.vetrf.ru/cerberus/regionalization/pub).
Given the lack of vaccines or other means of treatment and prevention, as well as the ambiguous regulations governing the circulation of raw milk from cows carrying the leucosis virus, the creation of a line of animals resistant to the cattle leucosis virus may become a trend in biotechnology aimed at fighting this infection.
To introduce the target modification into the virus-susceptible allele, with the aim of editing it into a resistant one, the CRISPR/Cas9 genetic engineering method was chosen. Using online molecular genetic resources, a gRNA that best satisfies the requirements was selected, and the off-target sites with the largest number of nucleotide mismatches to the gRNA were analyzed.

Methods
To accomplish these tasks, online tools were used that determine where to introduce breaks in the genomic sequence of interest and identify suitable target sites.
Bovine leucosis is a viral disease characterized by malignant proliferation of lymphatic tissue. Its causative agent is the bovine leucosis virus (BLV), an RNA virus of the family Retroviridae, genus Deltaretrovirus (Polat et al., 2017).
The development of leucosis depends on many factors, including the state of the animal's immune system and its genetic predisposition to the disease (van Eijk et al., 1992). The key gene is BoLA-DRB3 of the major histocompatibility complex, which contains 6 exons. Its exon 2, which is 284 bp long, is responsible for recognition of the antigen and for the formation of the primary immune response to BLV.
Numerous researchers have found that different exon-2 alleles of this gene are associated with resistance or susceptibility to BLV (Ernst et al., 1997; Juliarena et al., 2008; Mirsky, 1998; Nassiry et al., 2005).
For example, analysis of amino acid sequences showed that the alleles (*11, *23, *28) encoding the Glu-Arg amino acid sequence did not occur in animals infected with BLV.
Accordingly, animals infected with BLV carry no allele encoding the ER motif (glutamic acid and arginine). In animals with signs of leucosis, the *16 allele is the most common; sick individuals also carry alleles *8 and *22. Animals suffering from leucosis are characterized by the VDTN sequence (valine, aspartic acid, threonine, asparagine) at positions 75-78 (alleles *16 and *8) or by the VDTV sequence (valine, aspartic acid, threonine, valine) (allele *22). These alleles are associated with the alleles of the BoLA-DRB3*1C gene, for which a correlation with predisposition to BLV was shown previously (Ran et al., 2015).
Numerous studies by research groups around the world on various cattle breeds give grounds to regard the *11, *23 and *28 alleles as reliable markers of cattle resistance to leucosis.
Since the resistance and susceptibility of cattle to BLV are genetically determined, it is possible to develop a technology for creating BLV-resistant cattle by genetic engineering, in particular with the CRISPR/Cas9 genome editing technology. Such an approach would make it possible not only to obtain cattle genetically resistant to leucosis within two or three generations, but also to avoid a long, narrowly targeted breeding process, which reduces genetic diversity in the population and thereby negatively affects productivity and resistance to other infections. In addition, there is a negative correlation between milk production and resistance to the leucosis virus.
Thus, the technology of creating cattle populations resistant to the leucosis virus by molecular genetics and genomic editing has higher potential than traditional breeding methods.
The CRISPR/Cas9 system is an advanced tool in modern genetic engineering, one of the applications of which is the correction of mutations that cause hereditary diseases, both in vitro and in vivo (Menzorov et al., 2016; Ran et al., 2015; Cong et al., 2013; Cox et al., 2015).
The CRISPR/Cas9 system consists of two main parts. The first is a protein, the Cas9 nuclease, which is capable of introducing a double-stranded break in a DNA molecule. The second is a small RNA molecule of about 120 nucleotides, a chimeric guide RNA (gRNA) (Menzorov et al., 2016).
Genome editing with the CRISPR/Cas9 system involves a number of difficulties, including selecting the sequence of the gRNA, analyzing its specificity, and choosing sequences with the fewest off-target sites, because off-target effects are highly probable (Juliarena et al., 2008). Off-target sites differ from the target sequence by only a few nucleotides, so cleavage at them leads to undesirable mutations and chromosomal rearrangements. To address this problem, appropriate tools must be chosen to detect potential off-target sites. Such tools evaluate off-target genome modifications for each specific gRNA and compute its off-target sites.
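As a rough illustration of the comparison underlying such tools, the sketch below counts nucleotide mismatches between a 20-nt protospacer and candidate sites of the same length and keeps those within a mismatch limit. The function names, the example protospacer and the candidate sites are hypothetical and invented for illustration; actual tools search the whole genome and take the PAM into account, which this sketch does not attempt.

```python
def count_mismatches(protospacer: str, site: str) -> int:
    """Number of positions at which two equal-length DNA strings differ."""
    if len(protospacer) != len(site):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(protospacer.upper(), site.upper()))


def rank_off_targets(protospacer: str, sites: list, max_mismatches: int = 3) -> list:
    """Return (site, mismatches) pairs within the limit, closest sites first."""
    scored = [(s, count_mismatches(protospacer, s)) for s in sites]
    kept = [pair for pair in scored if pair[1] <= max_mismatches]
    return sorted(kept, key=lambda pair: pair[1])


# Hypothetical protospacer and candidate sites (not taken from the cattle genome).
protospacer = "GACGTTACCGGATCAAGCTA"
candidates = [
    "GACGTTACCGGATCAAGCTA",  # identical to the protospacer
    "GACGTTACCGGATCAAGCGG",  # differs in the last two positions
    "TTTATTACCGGATCAAGCTA",  # differs in the first four positions, filtered out
]
hits = rank_off_targets(protospacer, candidates)
for site, mismatches in hits:
    print(site, mismatches)
```

A site with zero mismatches would be cut as readily as the target itself, which is why candidates with such sites are undesirable.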
For the gRNA to be transcribed efficiently, several rules should be observed when selecting it:
1. The optimal site for introducing a double-stranded DNA break with the aim of knocking out the gene is a portion of the exon located about 150-300 bp after the start codon (Zakiyan et al., 2018).
2. The first nucleotide at the 5′-end of the gRNA (from which synthesis of the RNA molecule begins) should be guanine. Therefore, DNA sequences of the form G(N)19NGG must be selected.
3. The 17th nucleotide of the protospacer, which is directly adjacent to the break site and has the greatest influence on the repair outcome, should be thymine or adenine. In this case, an insertion or deletion with a shift of the reading frame is more likely to occur (Bae et al., 2014).
4. Off-target mutations must be excluded as far as possible while target mutations are selected efficiently.
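Rules 2 and 3 lend themselves to a simple pattern search. The following is a minimal sketch, assuming a plain DNA string of the forward strand only; the function name `find_candidates` and the example fragment are hypothetical, and rules 1 and 4 (distance from the start codon and off-target analysis) are not covered here.

```python
import re

# G, then 19 arbitrary nucleotides, then a PAM of the form NGG.
# The lookahead lets overlapping candidates be reported as well.
CANDIDATE = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")


def find_candidates(dna: str) -> list:
    """Return (start, protospacer, pam, rule3_ok) for each G(N)19NGG match."""
    dna = dna.upper()
    found = []
    for match in CANDIDATE.finditer(dna):
        site = match.group(1)
        protospacer, pam = site[:20], site[20:]
        # Rule 3: the 17th nucleotide (index 16) should be T or A.
        rule3_ok = protospacer[16] in "TA"
        found.append((match.start(), protospacer, pam, rule3_ok))
    return found


# Hypothetical 27-bp fragment, invented for illustration.
fragment = "AAGACGTTACCGGATCAATCTACGGTT"
for start, protospacer, pam, rule3_ok in find_candidates(fragment):
    print(start, protospacer, pam, rule3_ok)
```

In practice both strands would need to be scanned, which could be done by running the same search on the reverse complement.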
Such an analysis is impossible without a computer and appropriate software. Moreover, many programs rank the selected protospacers according to their suitability for editing and for avoiding off-target mutations, giving them a quantitative score (Ernst et al., 1997).

Results
To edit the cattle genome in order to create lines resistant to the leucosis virus, we chose to modify the susceptible allele (*16) of exon 2 of the BoLA-DRB3 gene by introducing a double-stranded break followed by non-homologous repair. Non-homologous repair of the damage generated by Cas9 is error-prone: it produces insertions and deletions, which favors the creation of a null allele ("knockout") of the edited gene. Such errors are usually small (1-10 bp). During non-homologous repair of the break, the site identical to the amino acid chain of the reverse transcriptase of the leucosis virus loses its identity because the nucleotide sequence changes. This shifts the reading frame and leads either to a change in the amino acid composition during protein biosynthesis or to the formation of a premature stop codon at which translation stops; in the latter case the protein is truncated and non-functional (Menzorov et al., 2016).
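Whether a small insertion or deletion shifts the reading frame is a matter of simple arithmetic: the frame is preserved only when the indel length is a multiple of 3. The sketch below illustrates this under stated assumptions; the helper names and the 15-bp coding fragment are hypothetical and serve only to show how the codons downstream of a 1-bp deletion change.

```python
def shifts_reading_frame(indel_length: int) -> bool:
    """True if an indel of this length (in bp) is not a multiple of 3."""
    return indel_length % 3 != 0


def codons(seq: str) -> list:
    """Split a coding sequence into complete triplets, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, position: int, length: int) -> str:
    """Remove `length` nucleotides starting at 0-based `position`."""
    return seq[:position] + seq[position + length:]


# Indel lengths in the typical 1-10 bp range that shift the frame.
frame_shifting = [n for n in range(1, 11) if shifts_reading_frame(n)]

# Hypothetical coding fragment, invented for illustration.
original = "ATGGCTGATACGAAC"
edited = delete(original, 4, 1)  # 1-bp deletion inside the second codon

print(frame_shifting)
print(codons(original))
print(codons(edited))
```

Every codon downstream of the deletion differs from the original, which is the effect the knockout strategy relies on.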
To create cattle lines resistant to the leucosis virus with the CRISPR/Cas9 genome editing system, the optimal target is the BoLA-DRB3 gene of the histocompatibility complex, located on chromosome 23 of the cattle genome, as confirmed by the NCBI database of molecular biological resources (https://www.ncbi.nlm.nih.gov/) (Figure 1). The priority of regions for the gRNA is indicated by color (green: optimal; red: worse) (Figure 4) (Zakiyan et al., 2018).
Based on the recommended selection criteria, we selected sequence number 22, with the nucleotide sequence GGAGCGGGAGCGGGCCTATGTGG. The guide comprises the 20 nucleotides at its 5′-end, immediately adjacent to the PAM site: GGAGCGGGAGCGGGCCUAUG (Figure 5).

Figure 5. Nonspecific sites of selected gRNA
The selected DNA sequence meets the basic requirements for guide selection:
1. The first nucleotide at the 5′-end of the DNA is guanine, and the sequence corresponds to the formula G(N)19NGG.
2. The 17th nucleotide of the protospacer, directly adjacent to the break, is thymine (T), which makes an insertion or deletion with a shift in the reading frame more likely.
3. The site has the PAM TGG.
4. The sequence has 6 off-target sites, each differing from the selected sequence by 3 nucleotides.
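The first three requirements can be checked directly against the 23-nt sequence number 22. The sketch below is one way to do so; the function name `check_site` and the keys of the returned dictionary are hypothetical, and the off-target requirement is not checked because it depends on genome-wide data reported by the selection program.

```python
def check_site(site: str) -> dict:
    """Split a 23-nt target (20-nt protospacer + 3-nt PAM) and test the rules."""
    site = site.upper()
    if len(site) != 23:
        raise ValueError("expected a 20-nt protospacer followed by a 3-nt PAM")
    protospacer, pam = site[:20], site[20:]
    return {
        "starts_with_G": protospacer[0] == "G",
        "nucleotide_17": protospacer[16],           # 17th nucleotide, 1-based
        "nucleotide_17_is_T_or_A": protospacer[16] in "TA",
        "pam": pam,
        "pam_is_NGG": pam[1:] == "GG",
        # RNA form of the protospacer: T is replaced by U.
        "spacer_rna": protospacer.replace("T", "U"),
    }


report = check_site("GGAGCGGGAGCGGGCCTATGTGG")
for key, value in report.items():
    print(key, value)
```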

Discussion
The allele selected for editing agrees with the literature data: it is susceptible to the cattle leucosis virus and contains the amino acid motif VDTY, which is found in animals with BLV.
Introducing a double-stranded break at the selected site with the CRISPR/Cas9 method, followed by non-homologous repair, is highly likely to shift the reading frame and thus to result in a non-functional protein (knockout).
For the selected sequence, the gRNA selection program does not offer DNA regions that are optimally suited for introducing the genetic construct, that is, regions that satisfy all the requirements, have no off-target sites and are shown by the program in green.
The program offers sites highlighted in orange with a minimum number of nonspecific sites (no more than 3). Analysis of the regions highlighted in orange showed that they have nonspecific sites that are fully identical to the selected sequence; these would entail undesirable mutations, so such sites are not suitable for editing. In addition, these genome regions did not meet the requirements whose observance is highly desirable when selecting a gRNA.
The gRNA with serial number 22 was therefore selected. This region begins with guanine; its 17th nucleotide is thymine; and all its nonspecific sites contain 3 nucleotide substitutions, which reduces the likelihood of the gRNA binding to off-target sites.
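The reasoning behind this choice can be expressed as a simple filter and ranking. The following sketch assumes each candidate is described only by the mismatch counts of its predicted nonspecific sites; the `Candidate` class, the function names and candidates 7 and 15 are hypothetical, while the entry for number 22 (6 nonspecific sites with 3 mismatches each) follows the values reported above.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    number: int                                   # serial number from the tool
    off_target_mismatches: list = field(default_factory=list)


def is_acceptable(candidate: Candidate) -> bool:
    """Reject candidates with a nonspecific site identical to the target."""
    return all(m > 0 for m in candidate.off_target_mismatches)


def rank(candidates: list) -> list:
    """Acceptable candidates, those with the most distant closest site first."""
    acceptable = [c for c in candidates if is_acceptable(c)]
    return sorted(
        acceptable,
        key=lambda c: (
            -min(c.off_target_mismatches, default=float("inf")),
            len(c.off_target_mismatches),
        ),
    )


candidates = [
    Candidate(7, [0, 2]),       # hypothetical: has an identical nonspecific site
    Candidate(15, [1, 3, 3]),   # hypothetical: closest site has 1 mismatch
    Candidate(22, [3] * 6),     # values reported for the selected gRNA
]
ranked = rank(candidates)
print([c.number for c in ranked])
```

Ranking by the closest nonspecific site first, and only then by the number of such sites, is a design choice of this sketch and not necessarily the scoring used by the selection program.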

Conclusion
As a result of this work, a genomic region was selected for introducing a double-stranded break with subsequent non-homologous repair in allele *16 of exon 2 of the BoLA-DRB3 gene.
A gRNA was selected. To minimize off-target mutations, the molecular biological information resources NCBI and CHOPCHOP were used.