Overview
Beginner level – genome annotations – orthologs
The Ensembl project provides free access to genome annotation datasets, including gene, variant and regulatory feature annotation, as well as comparative genomics analyses for over 300 vertebrate species and 30,000 non-vertebrate eukaryotes and prokaryotes. All of the data can be retrieved through Ensembl’s online genome browsers (www.ensembl.org, www.ensemblgenomes.org and rapid.ensembl.org) as well as programmatically via Ensembl’s REST API.
This tutorial will introduce you to the range of data available through Ensembl and to the concepts behind the Ensembl REST API. It will then guide you through the principles of retrieving Ensembl data programmatically using both Python and R.
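As a taste of what programmatic retrieval looks like, here is a minimal Python sketch of a single REST request. It is not taken from the tutorial materials: it assumes the public server at `https://rest.ensembl.org` and its `lookup/symbol` endpoint, and the helper names `build_lookup_url` and `fetch_json` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server; check the REST documentation
# for the current address and usage limits.
SERVER = "https://rest.ensembl.org"


def build_lookup_url(species: str, symbol: str) -> str:
    """Build the URL for a gene lookup by symbol (illustrative helper)."""
    return (
        f"{SERVER}/lookup/symbol/"
        f"{urllib.parse.quote(species)}/{urllib.parse.quote(symbol)}"
    )


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Send a GET request asking for JSON and return the decoded body."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access; fetching it does.
print(build_lookup_url("homo_sapiens", "BRCA2"))
```

With network access, `fetch_json(build_lookup_url("homo_sapiens", "BRCA2"))` should return a dictionary describing the gene. The sketch uses only the standard library so that it runs without extra installs; the tutorial's own notebooks may use a different HTTP client.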
To participate in the hands-on aspects of this tutorial, including writing and executing REST API queries, you will need to bring a laptop. Tutorial materials, including slides, screenshots, exercises, sample files and solutions will be available before the tutorial and will remain permanently online at the Ensembl training portal: https://training.ensembl.org.
Learning objectives
● Outline the different data types available through Ensembl.
● Identify the appropriate methods for data retrieval from Ensembl.
● Perform queries and extract data returned from the Ensembl REST API.
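The last objective, extracting data from a REST response, can be sketched in Python as follows. The field names (`display_name`, `id`, `seq_region_name`, `start`, `end`, `strand`) are assumed to match what the `lookup` endpoints return as JSON; the record below is a made-up placeholder, not real Ensembl data, and `summarise_gene` is a hypothetical helper.

```python
def summarise_gene(record: dict) -> str:
    """Format the core fields of a decoded gene lookup record as one line."""
    name = record.get("display_name", "unknown")
    stable_id = record.get("id", "unknown")
    region = record.get("seq_region_name", "?")
    start = record.get("start", "?")
    end = record.get("end", "?")
    # Strand is assumed to be reported as 1 (forward) or -1 (reverse).
    strand = {1: "+", -1: "-"}.get(record.get("strand"), "?")
    return f"{name} ({stable_id}) {region}:{start}-{end} ({strand})"


# Placeholder record with invented values, standing in for a decoded response.
example_record = {
    "display_name": "GENE1",
    "id": "ENSG_PLACEHOLDER",
    "seq_region_name": "1",
    "start": 100,
    "end": 200,
    "strand": 1,
}

print(summarise_gene(example_record))
# prints: GENE1 (ENSG_PLACEHOLDER) 1:100-200 (+)
```

Reading fields with `.get` and a default keeps the function from failing when a record lacks a field, which helps when the same code is applied to different feature types.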
Schedule
Time | Activity
---|---
09:00 – 09:30 | Introduction to the Ensembl genome browser and REST API
09:30 – 10:30 | The Ensembl REST API: endpoints and documentation, including example scripts and exercises using Jupyter notebooks
10:30 – 10:45 | Coffee break
10:45 – 12:00 | The Ensembl REST API: writing scripts to retrieve and process Ensembl data programmatically, including example scripts and exercises using Jupyter notebooks
12:00 – 12:15 | Wrap-up, Q+A and feedback
12:15 – 13:00 | Lunch - Join us for lunch, even if you only attend the morning tutorial!
Target audience and requirements
Maximum number of participants: 30
The tutorial is aimed at new and existing Ensembl users, from both the wet-lab and bioinformatics communities. The tutorial is designed to provide participants with a greater understanding of the data available through the Ensembl interfaces and how to efficiently retrieve it at various scales. There are no prerequisites for this tutorial, although a basic understanding of programming with Python or R would be beneficial. For the interactive aspects of this tutorial, participants are required to bring their personal laptops. Tutorial materials, including slides, screenshots, exercises, sample files and solutions will be available before the tutorial and will remain permanently online at the Ensembl training portal.
Organizers
Louisse Mirabueno, European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), UK
Louisse Mirabueno completed a Bachelor’s degree in Genetics and Cell Biology at Dublin City University and obtained an MPhil at the University of Reading. She has worked in both academic institutes and industry. Her experience ranges from genomic analyses in vertebrates and bacterial populations, through high-throughput screening in biotech and early-stage drug discovery, to research and development in long-read sequencing. Louisse joined the Ensembl Outreach team in January 2022 and has since delivered 28 in-person and virtual workshops teaching participants how they can integrate Ensembl data in their research.