IBSpy analysis to detect Triticum monococcum introgressions into domesticated hexaploid wheat
Hanin Ibrahim Ahmed, Jesus Quiroz-Chavez, Ricardo H. Ramirez-Gonzalez, Cristobal Uauy, Simon G. Krattinger
This dataset was used in the manuscript "Einkorn genomics sheds light on evolutionary history of the oldest domesticated wheat". We implemented the Identity-by-State Python pipeline (IBSpy: https://github.com/Uauy-Lab/IBSpy) and used it to identify T. monococcum introgressions in the ten hexaploid wheat pangenome cultivars. We ran IBSpy with k-mer databases from multiple genotypes, including Illumina raw reads (~10-fold coverage) of 218 T. monococcum accessions, two T. monococcum chromosome-scale assemblies, and ten wheat genome assemblies (Walkowiak et al., 2020). Variations were counted in 50-kbp windows. For details on how IBSpy detects variations, please read the IBSpy documentation. To estimate the variation-count cut-off and the sequence identity needed to detect T. monococcum introgressions, we compared the published pairwise MUMmer alignments (Brinton et al., 2020) of the ten pangenome cultivars (ArinaLrFor, Chinese Spring, Jagger, Julius, LongReach Lancer, CDC Landmark, Mace, Norin 61, Stanley, SY_Mattis) and the two T. monococcum assemblies generated here (TA299 and TA10622) with the corresponding variation counts from IBSpy. In total, 110 pairwise alignments were analyzed, focusing on the seven A-genome chromosomes.
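To make the idea of window-based variation counting concrete, the following is a minimal Python sketch. It is not the IBSpy implementation. It assumes that a "variation" is one maximal run of consecutive reference k-mers that are absent from the query k-mer set, and the function names, the tiny k-mer size, and the toy sequences are hypothetical, chosen only for illustration. The analysis described above used 50-kbp windows; the sketch uses a 10-bp window so the result can be checked by hand.

```python
from typing import Iterator, List, Set, Tuple


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_variations(
    reference: str, query_kmers: Set[str], k: int, window: int
) -> List[Tuple[int, int, int, int, int]]:
    """Return (start, end, total_kmers, observed_kmers, variations) per window.

    Assumption (illustrative only): a variation is one maximal run of
    consecutive reference k-mers that are missing from query_kmers.
    k-mers that span a window boundary are ignored in this sketch.
    """
    rows = []
    for start in range(0, len(reference), window):
        end = min(start + window, len(reference))
        total = observed = variations = 0
        in_gap = False  # True while inside a run of missing k-mers
        for kmer in kmers(reference[start:end], k):
            total += 1
            if kmer in query_kmers:
                observed += 1
                in_gap = False
            else:
                if not in_gap:
                    variations += 1  # a new run of missing k-mers begins
                in_gap = True
        rows.append((start, end, total, observed, variations))
    return rows


# Toy example: the query differs from the reference by a single substitution.
reference = "ACGTTGCATG"
query = "ACGTAGCATG"
query_db = set(kmers(query, 3))
print(count_variations(reference, query_db, k=3, window=10))
# prints [(0, 10, 8, 5, 1)]
```

In this toy case the single substitution removes three overlapping k-mers but is counted as one variation, which is why a run-based count tracks the number of sequence differences more closely than the raw number of missing k-mers. Windows with few variations against a T. monococcum k-mer database would then be candidates for introgressed segments, subject to the cut-off estimated from the MUMmer comparison.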
This data is made available under the Toronto Agreement
All of the data listed here is available under the prepublication data sharing principle of the Toronto Agreement.
By using this data, you agree to:
- respect the rights of the data producers and contributors to analyze and publish the first global analyses and certain other reserved analyses of this data set in a peer-reviewed publication.
- not redistribute, release, or otherwise provide access to the data to anyone outside of the group, until the data has been published and submitted to the public data repositories.
- contact the authors to discuss any plans to publish data or analyses that utilize this data to avoid the overlap of any planned analyses.
- fully cite the prepublication data along with any applicable versioning details.
- understand that this data as accessed is precompetitive and is not patentable in its present state.
This agreement does not expire with time, but only upon publication of the first global analysis by the data producers and contributors.