Tutorials & workshops

The [BC]² Basel Computational Biology Conference 2023 will feature a day of tutorials & workshops on Monday 11 September 2023.

Key dates

~~15 November 2022 - 15 January 2023 - Call for tutorials & workshops~~
~~1 February 2023 - Acceptance notification~~
~~28 February 2023 - Programme outline due~~
15 March 2023 - Registrations open
30 May 2023 - Final and detailed schedule due (incl. name of presenters)
11 September 2023 - Presentation at [BC]² from 9:00 - 16:00 at Biozentrum Basel (new building).

[BC]²tutorials and workshops provide an informal setting to learn about the latest bioinformatics methods, discuss technical issues, exchange research ideas, and share practical experiences on focused or emerging topics in bioinformatics.

General schedule

Time	Activity
08:15 – 09:00	Registration
09:00 – 10:30	Tutorials & Workshops
10:30 – 10:45	Coffee break
10:45 – 12:15	Tutorials & Workshops
12:15 – 13:00	Lunch break
13:00 – 14:30	Tutorials & Workshops
14:30 – 14:45	Coffee break
14:45 – 16:15	Tutorials & Workshops
17:00	[BC]² Welcome lecture

Please note that you need to register for your tutorial or workshop of choice via the online registration system (once registrations are open); tutorials and workshops are not included in the [BC]² registration fee. Each tutorial and workshop can take only a limited number of participants, so register on time!

Tutorials

Tutorials aim to provide participants with lectures or practical training covering topics relevant to the bioinformatics field. They offer participants an opportunity to learn about new areas of bioinformatics research, to get an introduction to important concepts, or to develop advanced skills in areas they are already familiar with. Each tutorial will be organized as a full-day (9:00 - 16:15) or a half-day event (9:00 - 12:15 or 13:00 - 16:15). If you choose a half-day tutorial, you are welcome to register for another half-day tutorial if space is available.

All tutorials take place at the Biozentrum -the new building- (University of Basel, Spitalstrasse 41, 4056 Basel).

Single-Cell Transcriptomics: get started! (full day)

[FULLY BOOKED] Overview

beginner level – single-cell transcriptomics – single-cell analysis – RNA sequencing (scRNA-seq)

Single cell RNA sequencing (scRNA-seq) has emerged as a revolutionary tool to measure the gene expression of complex biological systems at the level of individual cells. Its rapid development has facilitated the identification of previously unknown cell types in heterogeneous populations, empowering scientists to generate detailed tissue atlases describing the transcriptomic profiles of thousands or even millions of cells.

Although the single cell genomics field continues to rapidly develop, the lack of expertise on handling scRNA-seq datasets has restricted its use among the larger scientific community. This includes bioinformaticians who are only familiar with bulk sequencing data. The aim of this course is to enable researchers to start applying the fundamental scRNA-seq analysis pipeline to data generated in their labs. We will outline how to interpret results of a scRNA-seq dataset and, in teams, demonstrate the basics of data preprocessing and analysis using scanpy in Python. We will discuss common concerns in the field, including preprocessing choices, dimensionality reduction, cell type clustering and identification, batch effect correction, and pseudotime analysis. By the end of the tutorial, participants will be able to run scanpy analysis on their own datasets as well as understand how to overcome potential bottlenecks.

Learning objectives

The attendees of this tutorial will learn how to:

Evaluate the quality of an scRNA-seq experiment
Perform normalization, feature selection, dimensionality reduction and clustering with scanpy to identify cell types and differentially expressed genes
Understand preprocessing choices typically made during single cell analysis
Discuss and apply temporal methods such as pseudotime and RNA velocity
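
As a rough illustration of the normalization step named in the objectives above, the sketch below uses plain Python rather than scanpy. It applies library-size normalization followed by a log transform, the idea behind scanpy's `normalize_total` and `log1p` functions. The helper names, the toy count matrix and the default `target_sum` are assumptions made for this example and are not taken from the tutorial material.

```python
import math


def passes_qc(cell_counts, min_genes=2):
    """Keep a cell only if at least `min_genes` genes were detected in it."""
    detected = sum(1 for value in cell_counts if value > 0)
    return detected >= min_genes


def normalize_and_log(counts, target_sum=10_000):
    """Scale every cell to the same total count, then apply log(1 + x).

    `counts` is a list of cells, each cell being a list of raw gene counts.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell stays at zero instead of causing a division by zero.
            normalized.append([0.0 for _ in cell])
            continue
        scale = target_sum / total
        normalized.append([math.log1p(value * scale) for value in cell])
    return normalized


# Hypothetical toy data: three cells (rows) measured on three genes (columns).
TOY_COUNTS = [
    [1, 1, 2],
    [10, 10, 20],
    [0, 0, 5],
]

if __name__ == "__main__":
    kept = [cell for cell in TOY_COUNTS if passes_qc(cell)]
    for row in normalize_and_log(kept, target_sum=100):
        print([round(value, 3) for value in row])
```

The first two toy cells differ tenfold in sequencing depth but have the same relative composition, so they end up with identical normalized profiles. Removing that depth effect is the purpose of this step.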

Schedule

Time	Activity
09:00 - 09:30	Welcome and brief presentation: Single Cell Transcriptomics, an expanding field
09:30 – 10:00	Interactive session: interpreting 10X Genomics Cell Ranger output. How do I know if my scRNA-seq experiment was any good?
10:00 – 10:30	Introduction to scRNA-seq analysis in Python using scanpy. Walk-through of the analysis of a real dataset.
10:30 – 10:45	Coffee break
10:45 – 12:15	Team assignment: hands-on data analysis for single cell/nucleus RNA-seq. Preprocessing choices: filtering cells and genes. Regressing out covariates: cell cycle, batch effects. Feature selection and dimensionality reduction with PCA, tSNE, and UMAP.
12:15 – 13:00	Lunch break
13:00 – 14:30	Team assignment: hands-on data analysis for single cell/nucleus RNA-seq. Clustering analysis and cell type identification. Differential gene expression. Refining cluster selection to maximize the biological interpretation of your data.
14:30 – 14:45	Coffee break
14:45 – 15:45	Interactive session: extracting temporal information from single cell snapshots. What is pseudotime trajectory inference? When should I perform RNA velocity analysis?
16:00 – 16:15	Concluding remarks

Audience and requirements

Maximum number of participants: 25

The target audience will be students and researchers who plan on using single cell RNA sequencing in their own projects, or who are interested in learning how to analyze such data but are unfamiliar with the approaches and pitfalls of single cell analysis. It is preferred if participants have some previous exposure to a programming language (such as Python), but no experience with single cell data analysis itself is required and attendees do not need to be frequent programmers.

Attendees will be required to bring their laptop to the tutorial and to have completed the installation instructions for Python, Anaconda, and Jupyter notebooks, which we will provide a couple of weeks in advance of the tutorial date.

Organizers

Alex Lederer, PhD student in the laboratory of Neurodevelopmental Systems Biology with Dr. Gioele La Manno at EPFL, Switzerland
Alexandre Coudray, PhD student in the Laboratory of Virology and Genetics with Prof. Didier Trono at EPFL, Switzerland

Introduction to Spatial Transcriptomics Data Analysis (full day)

Overview

beginner level – Spatial Transcriptomics – Single-cell RNA sequencing

Single-cell RNA sequencing (scRNA-seq) technologies make it possible to define the gene expression profiles of single cells, allowing the characterisation of heterogeneous cell populations. However, scRNA-seq lacks the spatial context as the tissue must be dissociated. To overcome this issue, spatial transcriptomics (ST) is a new and evolving technology that measures transcriptomic information while preserving spatial information. Consequently, ST can unravel the complexity of biology, especially the molecular mechanisms behind diseases that cannot be understood by other technologies.

There are different ST techniques with different technical parameters, and there is often a trade-off between the number of genes profiled and the efficiency of the technique. In this tutorial, we will present an overview of the currently available methods to perform ST, with their strengths and drawbacks. The focus will be on the two main methods: spot-based (10x Visium) and in-situ (10x Xenium) technologies. Spot-based technology allows the detection of the whole transcriptome across the entire slice of tissue with high sensitivity and specificity, but its resolution is limited to that of the capture spots. In contrast, the newest in-situ technology, Xenium, released at the end of 2022, offers extremely high resolution at the subcellular level, but it can be biased due to the requirement of pre-selected gene targets. These two technologies can be complementary, as Xenium data can be used to interpolate the cell composition of Visium spots and leverage Visium whole transcriptomics to understand tissue heterogeneity and discover new biomarkers.

Therefore, in this tutorial, we will use computational tools to analyse single-cell RNA-seq, Visium and Xenium data to demonstrate how whole transcriptome and targeted in-situ data can be integrated to provide highly complementary and additive biological information. The right balance between lectures and practical training will provide entry-level guidance for participants interested in ST data analysis.

Learning objectives

At the end of this tutorial, attendees will understand how the different ST technologies work, and the advantages and limitations of each of the methods. This tutorial will provide the participants with theory and practical training to analyse ST data (Visium and Xenium), as well as a guide to the most recent computational tools to perform more advanced ST analysis. With the provided guidance of the instructors, participants will be able to independently analyse different ST data.

Schedule

Time	Activity
09:00 – 10:10	Introduction, motivation and theoretical background on spatial transcriptomics
10:10 – 10:30	Hands-on session: loading and understanding ST data
10:30 – 10:45	Coffee break
10:45 – 11:00	Theoretical background on differences between spot-based and in-situ ST technology
11:00 – 12:15	Hands-on session (10x Xenium): dimensionality reduction, marker genes
12:15 – 13:00	Lunch break
13:00 – 14:00	Hands-on session (10x Visium): spot-level deconvolution
14:00 – 14:30	Theoretical and hands-on session: 10x Xenium integration with 10x Visium (part 1)
14:30 – 14:45	Coffee break
14:45 – 15:45	Theoretical and hands-on session: 10x Xenium integration with 10x Visium (part 2)
15:45 – 16:15	Final discussions

Audience and requirements

Maximum number of participants: 30

This tutorial is designed for individuals interested in spatial transcriptomics data analysis. Participants must possess a basic understanding of R and of RNA-seq gene expression methods and data formats. Participants should bring a laptop with Wi-Fi and with the latest versions of R, RStudio, and relevant R packages installed. Details on package installation will be sent via email before the tutorial.

Organizers

Ana Cristina Guerra de Souza, PhD, Bioinformatician at the Translational Data Science Facility (SIB), Switzerland
Tania Wyss, PhD, Bioinformatician at Dept of Fundamental Oncology (UNIL) and at the Translational Data Science Facility (SIB), Switzerland

SQL for data science (full day)

Overview

beginner level – SQL – relational databases – data science

There are several reasons why learning or improving SQL (Structured Query Language) can be beneficial for data science:
1. SQL is the standard language for working with relational databases, which are a common source of data for data science projects. By learning SQL, you'll be able to easily retrieve, manipulate, and analyze the data stored in these databases.
2. SQL allows you to efficiently work with large datasets. When working with data science projects, it is common to deal with datasets that are too large to fit into memory. SQL provides powerful tools for filtering, grouping, and aggregating large datasets, so you can work with subsets of the data that are small enough to fit into memory.
3. SQL can help you to understand the underlying data better. Even though SQL is a programming language, it is declarative, unlike most general-purpose programming languages such as Python, C++ or Java. This means the code you write tells the SQL engine what you want, and the engine takes care of how to get the results. Writing queries this way helps you understand the dataset, its structure and its constraints.
4. SQL is a good tool for data preparation, cleaning and validation. Since it is a powerful tool to manipulate and filter data, you can use it to get your dataset into better shape before applying any statistical or machine learning models.
5. SQL can be a valuable skill in the job market. Many companies store data in relational databases, and SQL knowledge is often a required or preferred skill for data science positions.
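
To make the points on filtering, grouping and aggregating more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `expression` table, its columns and the toy rows are hypothetical and were made up for this illustration; they are not the database used in the tutorial. The query only states what is wanted (one row per tissue with a count and a mean), and the SQL engine decides how to compute it.

```python
import sqlite3

# Hypothetical toy rows: (sample_id, tissue, value).
ROWS = [
    ("s1", "liver", 2.0),
    ("s2", "liver", 4.0),
    ("s3", "brain", 10.0),
    ("s4", "brain", -1.0),  # a negative value that the filter below excludes
]


def mean_value_by_tissue(rows, min_value=0.0):
    """Return (tissue, n_samples, mean_value) tuples for values >= min_value."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE expression (sample_id TEXT, tissue TEXT, value REAL)"
        )
        conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)
        query = """
            SELECT tissue, COUNT(*) AS n_samples, AVG(value) AS mean_value
            FROM expression
            WHERE value >= ?      -- filtering
            GROUP BY tissue       -- grouping
            ORDER BY tissue       -- deterministic output order
        """
        return conn.execute(query, (min_value,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for tissue, n_samples, mean_value in mean_value_by_tissue(ROWS):
        print(tissue, n_samples, mean_value)
```

Only the aggregated rows come back to Python, which is how SQL keeps the working set small even when the underlying table is large.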

In summary, learning SQL can be a valuable addition to your data science toolkit, as it allows you to efficiently work with large datasets and can be a valuable skill in the job market. Take a look at the results of this survey.

Schedule

Time	Activity
09:00 – 09:10	What is SQL?
09:10 – 09:20	Data models: structure and content
09:20 – 09:30	Datatypes in SQL
09:30 – 09:45	Retrieve data with SELECT
09:45 – 10:00	Filtering with SQL
10:00 – 10:15	Advanced filtering: IN, OR, AND and NOT
10:15 – 10:30	Coffee break
10:45 – 11:00	Wildcards
11:00 – 11:15	Ordering your results with ORDER BY
11:15 – 11:45	Aggregate functions: AVG, COUNT, MIN, MAX and SUM
11:45 – 12:15	Grouping your results with GROUP BY
12:15 – 13:00	Lunch break
13:00 – 13:15	Subqueries
13:15 – 13:30	Introduction to joining tables
13:30 – 13:40	Cross joins (optional)
13:40 – 14:05	Inner Joins
14:05 – 14:30	Left, right and outer joins
14:30 – 14:45	Coffee break
14:45 – 15:15	Set operations with UNION, INTERSECT and EXCEPT
15:15 – 15:35	Date and Time
15:35 – 15:55	Views
15:55 – 16:15	Conclusions

Audience and requirements

Maximum number of participants: 20

This tutorial is designed for data scientists, programmers, bioinformaticians or students. Participants should bring a laptop and install the DBMS and download the database prior to the tutorial (still under discussion).

Organizers

Dillenn Terumalai, Software developer at SIB clinical bioinformatics, Switzerland
Florent Tassy, Data manager and software developer at SIB clinical bioinformatics, Switzerland

Make your research FAIRer with Quarto, GitHub and Zenodo (full day)

Overview

Intermediate level – FAIR principles

The FAIR (Findable, Accessible, Interoperable and Reusable) principles provide guidelines for making research data and other resources more easily discoverable and reusable, which can help increase your research's impact and exposure. Adhering to these principles also ensures that your research is more reliable and reproducible, as others can more easily access and provide feedback. In addition, making your research FAIR can promote the principles of open science and make it easier for others to contribute to and build upon your work. Finally, many funding agencies and journals now expect that your research outputs be made FAIR as a condition of funding or publication, so adhering to these principles can help to ensure that your research meets these requirements.

Sharing and reusing data, software, and documentation is essential to the FAIR principles and should be a routine task for scientists. Researchers can achieve this by providing all necessary information (data, software and parameters used, scripts for the analysis, databases and their versions) and using markdown to create a single file that can be easily shared as a web page.

Learning objectives

In this tutorial, participants will learn tools and concepts to take significant steps towards adhering to the FAIR principles, which will enhance the benefits of sharing and collaboration in research. More specifically, they will learn to:

Create notebooks and websites based on Markdown, and Python or R with Quarto
Use Git and GitHub to version control the generated content
Host a website by making use of GitHub actions and GitHub pages
Link the GitHub repository to Zenodo and give it a unique identifier (DOI)

Schedule

Time	Activity
09:00 – 10:30	Creating notebooks in Markdown with Quarto
10:30 – 10:45	Coffee break
10:45 – 12:15	Basics of version control with git and GitHub
12:15 – 13:00	Lunch break
13:00 – 14:30	Using GitHub pages to host a website created with Quarto
14:30 – 14:45	Coffee break
14:45 – 16:15	Introduction to Zenodo and how to link GitHub repositories to Zenodo for long-term storage and DOI

Audience and requirements

Maximum number of participants: 30

This tutorial is aimed at computational biologists, bioinformaticians, researchers, scientists and trainers working in the life sciences who want to learn how to make their research and training FAIRer with reproducible notebooks and websites. Participants are expected to have an introductory level in programming with R or Python. Participants should have a GitHub account and bring their laptops with either the latest versions of RStudio or VSCode pre-installed.

Organizers

Geert van Geest, Bioinformatics Trainer & Support Specialist (SIB), Bioinformatician and data analyst at Interfaculty Bioinformatics Unit of the University of Bern, Switzerland
Wandrille Duchemin, Bioinformatics Trainer & Computational Biologist (SIB), Bioinformatician and data analyst at the sciCORE core facility of the University of Basel, Switzerland

Omnibenchmark: open continuous community-driven benchmarking of computational methods (1/2 day - AM)

Overview

Intermediate level – Method benchmarking – benchmarks – omics datasets

The rise of large-scale omics datasets has led to a growing number of methods to model and interpret them, making it hard to identify performant computational methods to use in discovery research. Method benchmarking is critical to dissect important steps of an analysis pipeline, formally assess performance across common situations and edge cases, and ultimately guide users. While some form of comparison is standard practice in method development, current approaches have several limitations (Sonrel et al. 2022). One main issue is that benchmarks are not easily extensible; with the constant emergence of new approaches, this often leads to rapidly outdated or even contradictory and irreproducible conclusions. We believe that if benchmarks were organized in a systematic way and conducted at a higher standard, they would have a considerably greater impact in guiding best practice in employing computational methods for discovery research.

To address these issues, we created Omnibenchmark, a modular and extensible framework based on the free open-source platform renku. The framework connects data, method, and metric modules via a knowledge graph, which tracks relevant metadata (e.g. software environment, parameters and commands). Results can be viewed in a dashboard or openly accessed. To facilitate community contributions of new reference datasets, evaluation metrics or computational methods, we maintain a series of pre-configured templates with which new modules can be added easily. Each element of Omnibenchmark is packaged with all dependencies and can be inspected, re-used, modified, and integrated onto other platforms in compliance with the FAIR principles.

Learning objectives

At the end of the tutorial, participants should be able to:

understand components of the Omnibenchmark framework
submit new reference datasets, methods or metrics modules
visualize and re-use output results and performance summaries

Schedule

Time	Activity
09:00 – 09:30	Introduction to Omnibenchmark framework
09:30 – 10:30	Group creation & first hands-on session following our step-by-step guide
10:30 – 10:45	Coffee break
10:45 – 11:15	Technical aspects of Omnibenchmark module creation
11:15 – 12:00	Second hands-on session
12:00 – 12:15	Q&A and Closing remarks
12:15 – 13:00	Lunch - Join us for lunch, even if you only attend the morning workshop!

Audience and requirements

Maximum number of participants: 20

This tutorial is aimed at computationally minded researchers who are tasked with benchmarking methods that they are familiar with (or that they developed), as well as at benchmarkers who would like to port their work to our system.

Participants are expected to have:
- Intermediate knowledge of R or Python
- Basic knowledge of UNIX and version control systems (Git)
- Personal laptop with Wi-Fi connection
- Code for method(s) and/or benchmark(s) (optional)

Organizers

Main presenters: Almut Lütge and Anthony Sonrel, PhD Students and co-developers of Omnibenchmark, Statistical Bioinformatics Group at the University of Zurich, Switzerland
Technical collaborators and knowledge exchange: Dr. Izaskun Mallona, Statistical Bioinformatics Group at the University of Zurich (co-developer of Omnibenchmark); Dr. Charlotte Soneson, Research Associate, Computational Biology Platform, FMI and SIB Swiss Institute of Bioinformatics, Switzerland
Introductions and support: Mark D. Robinson, Associate Professor, University of Zurich and SIB Group Leader, Switzerland

Using the Ensembl REST API to retrieve genome annotation data (1/2 day - AM)

Overview

Beginner level – genome annotations – orthologs

The Ensembl project provides freely available access to genome annotation datasets including gene, variant and regulatory feature annotation as well as comparative genomics analyses for over 300 vertebrate species and 30'000 non-vertebrate eukaryotes and prokaryotes. All of the data can be retrieved through Ensembl’s online genome browsers (www.ensembl.org, www.ensemblgenomes.org and rapid.ensembl.org) as well as programmatically via Ensembl’s REST API.

This tutorial will introduce you to the range of data available through Ensembl and the concepts of the Ensembl REST API and guide you through the principles of retrieving Ensembl data programmatically using both Python and R.
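
As a taste of what programmatic retrieval can look like in Python, the sketch below builds a request for the public Ensembl REST server at `https://rest.ensembl.org` and decodes the JSON reply. It assumes the `lookup/symbol/:species/:symbol` endpoint and its optional `expand` parameter as described in the Ensembl REST documentation. The helper names `lookup_symbol_url` and `fetch_json` are made up for this illustration and are not part of the tutorial material.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_symbol_url(species, symbol, expand=False):
    """Build the URL that looks up a gene by species and gene symbol."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(symbol)
    )
    url = ENSEMBL_REST + path
    if expand:
        # Assumption: expand=1 asks the server to include transcripts as well.
        url += "?" + urllib.parse.urlencode({"expand": 1})
    return url


def fetch_json(url, timeout=30):
    """Send a GET request asking for JSON and return the decoded reply."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example use (needs an internet connection, so it is left commented out):
# gene = fetch_json(lookup_symbol_url("homo_sapiens", "BRCA2"))
# print(gene.get("id"), gene.get("seq_region_name"), gene.get("start"), gene.get("end"))
```

Keeping URL construction separate from the network call makes the query easy to inspect and test before any request is sent.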

To participate in the hands-on aspects of this tutorial, including writing and executing REST API queries, you will need to bring a laptop. Tutorial materials, including slides, screenshots, exercises, sample files and solutions will be available before the tutorial and will remain permanently online at the Ensembl training portal: https://training.ensembl.org.

Learning objectives

● Outline the different data types available through Ensembl.
● Identify the appropriate methods for data retrieval from Ensembl.
● Perform queries and extract data returned from the Ensembl REST API.

Schedule

Time	Activity
09:00 – 09:30	Introduction to the Ensembl genome browser and REST API
09:30 – 10:30	The Ensembl REST API: endpoints and documentation. This will include exploring example scripts and exercises using Jupyter notebooks.
10:30 – 10:45	Coffee break
10:45 – 12:00	The Ensembl REST API: writing scripts to retrieve and process Ensembl data programmatically. This will include exploring example scripts and exercises using Jupyter notebooks.
12:00 – 12:15	Wrap-up, Q+A and feedback
12:15 – 13:00	Lunch - Join us for lunch, even if you only attend the morning tutorial!

Target audience and requirements

Maximum number of participants: 30

The tutorial is aimed at new and existing Ensembl users, from both the wet-lab and bioinformatics communities. The tutorial is designed to provide participants with a greater understanding of the data available through the Ensembl interfaces and how to efficiently retrieve it at various scales. There are no prerequisites for this tutorial, although a basic understanding of programming with Python or R would be beneficial. For the interactive aspects of this tutorial, participants are required to bring their personal laptops. Tutorial materials, including slides, screenshots, exercises, sample files and solutions will be available before the tutorial and will remain permanently online at the Ensembl training portal.

Organizers

Louisse Mirabueno, European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), UK

Louisse Mirabueno completed a Bachelor’s degree in Genetics and Cell Biology at Dublin City University, and obtained an MPhil at the University of Reading. She has worked in both academic institutes and industry. Her experience spans genomic analyses in vertebrates and bacterial populations, high-throughput screening in biotech and early-stage drug discovery, and research and development in long-read sequencing. Louisse joined the Ensembl Outreach team in January 2022 and has since delivered 28 in-person and virtual workshops teaching participants how they can integrate Ensembl data in their research.

Microbial genomics: From raw data to functional annotations to OpenGenomeBrowser (1/2 day - PM)

Overview

Beginner level – Long-read sequencing – genome annotations – orthologs

Today, bacterial genomes can be sequenced for less than 200 dollars per genome. If sequenced with long-read technologies, it is often possible to assemble the data into complete bacterial chromosomes including plasmids. In this beginner-friendly tutorial, we will show and explain the steps required to go from raw sequence data to an assembled and annotated genome made accessible through OpenGenomeBrowser. To provide a hands-on experience, each participant will receive a PacBio HiFi raw dataset of one bacterial strain, with which the entire workflow will be demonstrated. In the end, the products of all participants will be collected, distributed to everyone, and imported into OpenGenomeBrowser.

Because the tools used in this tutorial are resource-efficient and provided as Docker containers, participants will be able to follow the tutorial on their own laptops or using a shell in the cloud. If a step in the pipeline does not work, the intermediate results will be made available so that the participant can continue the tutorial.

The following tools will be used during the tutorial:
- Assembly: La Jolla Assembler (LJA) (Bankevich et al. 2022)
- Polishing: HyPo (Kundu et al. 2019)
- Structural annotations: bakta (Ekim et al. 2021)
- Import into genome database: OpenGenomeBrowser (Roder et al. 2022)

Learning objectives

The objectives of the tutorial are:

Getting an overview of a microbial genome assembly and annotation pipeline
Understanding challenges in microbial genome data management
Understanding orthologs
Gaining experience using Docker: basic data processing and hosting the OpenGenomeBrowser stack (3 Docker images coordinated using Docker Compose)

Schedule

Time	Activity
12:15 – 13:00	Lunch - Join us for lunch, even if you only attend the afternoon workshop!
13:00 – 13:45	Theory: Overview of a microbial genomics pipeline, challenges in genome data management
13:45 – 14:30	Practice: download raw reads, run LJA, HyPo and bakta
14:30 – 14:45	Coffee break
14:45 – 15:30	Theory: functional annotations, orthologs and OpenGenomeBrowser
15:30 – 16:15	Practice: download toy dataset of ~ 10 processed genomes, import into OpenGenomeBrowser

Target audience and requirements

Maximum number of participants: 30

The tutorial is aimed at computational biologists, bioinformaticians, researchers and trainers working in life science who want to learn about the basics of microbial genomics. Participants are expected to have basic experience in Bash. If they want to work locally, participants should install the latest version of docker beforehand. If not, any laptop can be used to follow the tutorial in the cloud.

Organizers

Thomas Roder, PhD student at the Interfaculty Bioinformatics Unit of the University of Bern, Switzerland
Marco Kreuzer, Bioinformatician and data analyst at Interfaculty Bioinformatics Unit of the University of Bern, Switzerland
Heidi Tschanz-Lischer, Bioinformatician and data analyst at Interfaculty Bioinformatics Unit of the University of Bern, Switzerland

Workshops

Workshops encourage participants to discuss technical issues, exchange research ideas, and share practical experiences on some focused or emerging topics in bioinformatics. Please note that all workshops will be organized as a full-day event (9:00 - 16:15), except for one half-day workshop, "Standardization of single-cell metadata: an Open Research Data initiative", which will take place from 13:00 - 16:15. If you choose to participate in the half-day workshop, you are welcome to register for a half-day tutorial as well if space is available.

All workshops take place at the Biozentrum -the new building- (University of Basel, Spitalstrasse 41, 4056 Basel).

Mechanistic and AI digital twins in personalized medicine - two sides of the same coin (full day)

Overview

Beginner level – systems biology – modelling – digital twins – data annotation and curation – personalized medicine

Digital twins are an emergent concept in Computational Systems Biology and personalized medicine. Many initiatives show an interest in creating computational models of increasing complexity that could be used to represent virtual patients and help in decision-making regarding the appropriate treatment. Having robust and reliable computational models spanning all biological layers, such as gene expression, signalling and metabolism, to name a few, could revolutionize the way we treat Big Data for the benefit of precision medicine and improved medical care tailored to the needs of each patient. As the number of computational models rapidly increases, the production of data keeps growing, and the approaches, both mechanistic and AI-based, are rapidly developing, discussions about challenges and best practices are needed more than ever.

The purpose of the workshop is to bring together researchers working in the field of digital twins in computational systems biology who use various formalisms to address challenges of data integration and model personalization. The focus will be on presenting the state of the art in the field and how mechanistic computational modelling can upscale benefiting from AI-based methodological advances.

Several experts on computational modelling for personalized medicine, AI-based data integration and clinical digital twins will be invited (see list of tentative speakers below). In addition, a call for abstract submissions will be issued.

List of tentative invited speakers:

Liesbet Geris, VPH Institute, BE
Maria Rodriguez Martinez, IBM Zurich, CH
Mariano Vazquez, BSC, ES
Kristin Reiche, Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, DE
Jurgen Kuball or Zsolt Sebestyen, UMC Utrecht Cancer Centre, NL
Ioannis Xenarios, UNIL/CHUV/Health2030 Genome Center, CH

Schedule

The morning sessions will be dedicated to AI-based approaches for medical digital twins; the afternoon sessions will cover the challenges of the field, addressing both ethical and technical aspects (storage, access and protection of sensitive data, calculation power, computational frameworks). Each session will feature one invited talk from a top expert in the field.

Time	Activity
09:00 – 09:10	Welcome and introduction
09:10 – 10:30	Session 1 – AI digital twins - the power of algorithms for big data in medical care Chair: Arnau Montagud, BSC, ES - Maria Rodriguez Martinez, IBM Zurich, CH, “Interpretable deep learning to model the immune system” (35 min + 5 min Q&A) - Vincent Noël, Institut Curie, FR, “Building virtual twins of tumors using PhysiBoSS” (15 min + 5 min Q&A)
10:30 – 10:45	Coffee break
10:45 – 12:05	Session 2 – Mechanistic digital twins - unravelling mechanisms through causal interactions Chair: Vincent Noël, Institut Curie, FR - Kristin Reiche, Fraunhofer IZI, DE, “Power and Hurdles of integrating multi-omics data into medical virtual twins” (35 min + 5 min Q&As) - Gautier Stoll, Institut Gustave Roussy/Université Paris 5, FR “Mathematical model of CAR T cell therapy” (15 min + 5 min Q&A) - Anna Niarakis, Evry University, FR, “Hybrid approaches for modelling Rheumatoid arthritis” (15 min + 5 in Q&A)
12:15 – 13:00	Lunch break
13:00 - 14:40	Session 3 – Scaling up - feasibility of simulations for medical digital twins Chair: Anna Niarakis, Evry University, FR - Liesbet Geris, “From digital twins in healthcare to the integrated Virtual Human Twin” VPH Institute, BE (35 minutes + 5 min Q&A) - Mariano Vazquez, “Supercomputer-based virtual humans for cardiovascular therapies" BSC, ES (35 min + 5 min Q&As) [online session] - Thalia Diniaco, BSC, ES "PhysiCell-X is a multiscale modelling framework that brings us closer to the Digital Twins" (15 min + 5 min Q&A)
14:40 - 14:55	Coffee break
14:55 - 16:05	Session 4 – Challenges in clinical and preclinical implementation Chair: Laurence Calzone, Institut Curie, FR - Jurgen Kuball “Targeting cancer with gdT cell receptors”and Zsolt Sebestyen “3D models to ass was potency and efficacy”, UMC Utrecht Cancer Centre, NL (35 min + 5 min Q&A) - Ioannis Xenarios, Health2030 Genome Center, CH (35 min+ 5 in Q&A)
16:05 – 16:35	Discussions and closing remarks - Anna Niarakis, Evry University, FR, “A US-EU community effort to build Immune Digital Twins” (10 min) - Panel on “Digital twins in personalized medicine: challenges and opportunities” with: Maria Rodriguez Martinez, IBM, CH; Kristin Reiche, Fraunhofer IZI, DE; Liesbet Geris, VPH Institute, BE; Jurgen Kuball, UMC Utrecht Cancer Centre, NL; Zsolt Sebestyen, UMC Utrecht Cancer Centre, NL; Ioannis Xenarios, Health2030 Genome Center, CH

Target audience and requirements

Maximum number of participants: 40

This workshop is aimed at students and researchers interested in learning about the emerging and promising concept of digital twins, giving them a unique opportunity to discover and discuss the state of the art in the field.

Call for abstracts

The organizers invite interested scientists to submit an abstract for a selected talk. The abstract should be a PDF of at most 400 words, with figures if needed, and must not exceed 1 page. Submissions should be sent to laurence.calzone@curie.fr with the subject line: [BC]2 workshop abstract submission

Deadlines

Abstracts sent by: 20 June 2023
Decision on selected abstracts: 28 June 2023
[BC]² workshop early-bird registration deadline: 30 June 2023
Workshop: 11 September 2023

Organizers

Anna Niarakis (Associate Professor; UEVE, Univ Paris-Saclay & INRIA Saclay; France)
Laurence Calzone (Research Scientist; Institut Curie, U900 INSERM & Mines Paris Tech; France)
Arnau Montagud (Postdoctoral Researcher at the Barcelona Supercomputing Center; Spain)

Standardization of single-cell metadata: an Open Research Data initiative (1/2 day - PM)

Overview

Single-cell functional genomics is a rising field of biology, which is bringing major insight into the life sciences. Single-cell data are rapidly increasing both in quantity and in diversity, but lack method and metadata standardization. While some large projects have clear standards of reporting, most datasets in biological databases have partial or non-standardized metadata. This leads to multiple non-compatible standards across datasets, and limits reusability, which in turn presents challenges to making these data useful to an increasing community of specialists and non-specialists.

There is a need for standards in the way single-cell data are stored and annotated, especially for cell type and other associated information. Indeed, metadata is critical to the capacity to use these large and potentially very informative datasets. It includes protocols, which constrain which transcripts were accessible or which normalizations are relevant, the association between barcodes and annotations, or the methods used to identify cell types. Existing ontologies and controlled vocabularies are not used systematically, even when information is reported.

To address these limitations, the funded project scFAIR has the mission to build a collaborative platform supporting and disseminating Open Research Data (ORD) practices for the single-cell genomics community, both for sharing datasets and their metadata, and for standardizing the way data are shared across datasets. During this workshop, participants will have the opportunity to shape the definition of this collaborative platform, by sharing their needs and best practices. This workshop will notably cover:

how to describe and capture information about processing pipelines, in order to reproduce analyses, and determine how cell features were identified;
how to provide in a standard way information about barcodes identifying cells, and the raw data of the analysis;
how to annotate cell clusters with controlled vocabularies and ontologies;
best practices in quality control;
best practices in cell type inference.

On each topic, after an introduction by a field expert, we will ask participants to share the difficulties they face in their own work, and what they are missing to leverage existing single-cell data to their full potential. We will then work together to synthesize recommendations. We expect that, as a result, participants will gain a better knowledge of the existing landscape in single-cell biocuration, and become collaboratively involved in the future of Open Research Data in scFAIR.

Contributions

Selection mode: both invitation and call for submissions
The organizers will contribute presentations. We will also invite experts in the single-cell field from the joint working group between SIB and AGORA Cancer Research Center.

Invited speakers

Jason Hilton, Stanford University
Jason is the project lead for Lattice, a data coordination project supporting the Human Cell Atlas, and part of the team managing the CZI CELLxGENE resource.
David Osumi-Sutherland, EMBL-EBI
David leads a group of Ontology editors and developers within the Samples, Phenotypes and Ontologies (SPOT) team at the EBI, providing semantic solutions for annotating and organizing biomedical data following FAIR principles. His team notably contributes to the Cell Ontology and Provisional Cell Ontology.

Target audience and requirements

Maximum number of participants: 20

The workshop targets biologists, bioinformaticians, or curators, working or intending to work on single-cell data.

Organizers

Frédéric Bastian: associate group leader of the SIB team developing the Bgee database, specialized in gene expression representation, biocuration and bio-ontologies.
Vincent Gardeux: Senior scientist at Deplancke’s lab at EPFL, developing the ASAP portal for single-cell RNA-seq analysis.
Marc Robinson-Rechavi: professor of bioinformatics at the department of ecology and evolution of UNIL, and group leader of the SIB team developing the Bgee database.

Evaluation committee

Patricia Palagi, Head of Training, SIB Training Group, SIB Swiss Institute of Bioinformatics, Switzerland
Tülay Karakulak, PhD candidate, SIB Swiss Institute of Bioinformatics and Institute of Molecular Cancer Research of the University of Zurich, Switzerland
Aiswarya Prasad, PhD candidate, SIB Swiss Institute of Bioinformatics and the Department of Fundamental Microbiology of the University of Lausanne, Switzerland

The [BC]² tutorial and workshop session is coordinated by the SIB Training Group. For more information please contact bc2@sib.swiss.

Rules and responsibilities

On-site participation and presentation

All tutorials and workshops will take place onsite in Basel. To foster exchanges and interactivity during the sessions, participants, organizers and presenters must be present in person.
In limited cases, the evaluation committee can approve the virtual participation of a tutorial/workshop presenter who is unable to travel or attend onsite. Please indicate the reason for virtual attendance when submitting.

[BC]² registration fees - travel and accommodation costs

Each tutorial/workshop will receive up to two free conference registrations for organizers or speakers.
[BC]² will not cover travel and accommodation costs for tutorial/workshop organizers or presenters (see below).

[BC]² will be responsible for:

Providing a meeting venue with necessary technical equipment and catering services during coffee breaks
Providing staff to help with the on-site/online organization
Announcing the detailed schedule of the tutorial/workshop on the conference website
Advertising on [BC]² social media (based on material/info received from organizers). A social media banner template will be provided to organizers to ensure consistency with the visual identity of the conference
Updating organizers, on a monthly basis, about the number of registered participants
[BC]² and SIB Training Group reserve the right to cancel a scheduled tutorial/workshop if fewer than 10 participants are registered one month before the conference

Tutorial / workshop organizers will be responsible for:

Finding financial support for the organization of the tutorial/workshop. Tutorial/workshop organizers are highly encouraged to seek independent funding for travel and accommodation of their speakers / presenters.
Advertising the tutorial/workshop and distributing its call for papers/participation (if applicable). This includes promoting their event to relevant newsgroups and mailing lists, and especially to potential audiences from outside the core [BC]² conference community.
Finalising the programme/detailed schedule (incl. name of speakers) and providing material by the specific conference deadlines (see key dates)
Compiling and distributing material to the participants (if applicable). Note that you will be able to upload material as PDF on the conference website upon request, and that the conference will not handle photocopying of handouts.
Leading the event at [BC]².

Open and FAIR

Note that organizers are responsible for ensuring that the tutorial and workshop materials are legally used, and that appropriate copyright permissions have been arranged. Lecturers will ensure that tutorial and workshop materials are as open and FAIR as possible. They agree that their material may be made available in any form by the conference to [BC]² conference participants.
