Tutorials & workshops

[BC]²'s tutorials and workshops provide an informal setting to learn about the latest bioinformatics methods, discuss technical issues, exchange research ideas, and share practical experiences on focused or emerging topics in bioinformatics.

Tutorials and workshops take place on Monday 13 September, from 9:00 to 16:00. Tutorials will be held at the Kollegienhaus of the University of Basel and workshops at the Biozentrum (new building).

General schedule (see tutorials' and workshops' descriptions for more details on the respective content)

Time	Activity
08:15 – 09:00	Registration
09:00 – 10:15	Tutorials & Workshops
10:15 – 10:45	Coffee break
10:45 – 12:15	Tutorials & Workshops
12:15 – 13:30	Lunch break
13:30 – 16:00	Tutorials & Workshops
17:00	[BC]² Welcome lecture

Please note that you need to register for your tutorial or workshop of choice via the online registration system (once registrations are open); tutorials and workshops are not included in the [BC]² registration fee. Places are limited, so register on time!

COVID-19 and on-site participation

Tutorials and workshops are planned as on-site events with a hygiene concept closely coordinated with the cantonal and national decisions in order to ensure maximum safety for our participants.

Participation in the tutorials and workshops is granted on a first-come, first-served basis. Please note that room capacities may be affected by the local COVID-19 regulations in September. If we have to accept fewer people than foreseen, participants who registered late for a tutorial or workshop will not be able to attend. Registration fees already paid for the tutorial or workshop will be reimbursed.

In the event of restrictions preventing the tutorials and workshops from taking place in person, participants will be informed about a possible virtual version.

Tutorials

Tutorials aim to provide participants with lectures or practical training covering topics relevant to the bioinformatics field. Please note that some of the tutorial schedules still need to be adapted to align with the start of the [BC]² Welcome lecture.

All tutorials take place at the Kollegienhaus (University of Basel, Petersgraben 50, 4051 Basel).

Inferring gene regulatory networks from high-throughput data

Overview

beginner level –– high-throughput data analysis –– gene expression –– gene regulatory networks

Do you have transcriptomic or epigenomic data that measures changes in gene expression or chromatin state across a set of conditions and want to know which regulators and regulatory interactions are driving these changes? Our laboratory has developed tools that model transcriptomic (i.e. RNA-seq) or epigenomic (i.e. ChIP-seq, DNase-seq, or ATAC-seq) data in terms of computationally predicted binding sites of transcription factors and microRNAs, to infer the key regulatory interactions that are driving changes in gene expression and chromatin state.

In this tutorial, we will introduce users to two tools: ISMARA, which analyzes gene expression data, and CREMA, which analyzes chromatin state data to infer activities of cis-regulatory elements (i.e. both promoters and distal enhancers) genome-wide. Both tools are implemented as web servers that take raw sequencing data as input, perform all modelling of this data in a completely automated manner, and provide comprehensive predictions of the regulatory network and regulatory interactions through an interactive online interface. Both ISMARA and CREMA are highly sophisticated tools that offer users a large number of interactive possibilities to explore predictions and to generate new analyses of the data. The main objective of this tutorial is to allow attendees to develop an in-depth understanding of the variety of analyses that these tools provide, and how to optimally use them for answering specific questions about the regulatory network operating in the user’s system of interest.

Learning objectives
The attendees of the tutorial will learn how to use the ISMARA and CREMA systems, and obtain an in-depth exploration of all the analysis results that the system provides including:

What are the key regulators and how do their activities change across the samples?
What genes and pathways are targeted by each regulator?
What is the core network of interactions between the key regulators?
What are the main regulators of a particular gene or enhancer element, and how does each regulator contribute to its expression/activity across the samples?
Exploring embedded links to the String and SwissRegulon databases.
How to average across replicate data and how to calculate contrasts between subsets of samples.
How to download comprehensive predictions to allow further downstream processing and analysis of these results.

At the end of the tutorial, the attendees should have the expertise to perform sophisticated regulatory network predictions from RNA-seq or ChIP/ATAC/DNase-seq data using the ISMARA and CREMA tools.

Schedule

Time	Activity
09:00 – 10:15	Introduction to Motif Activity Response Analysis. Transcription factor binding site predictions, processing of raw RNA-seq and ChIP-seq data, and the MARA model.
10:15 – 10:45	Coffee break
10:45 – 12:15	Introduction to CREMA. Identification of cis-regulatory elements (CREs) genome-wide and application of the MARA model. Overview of the analysis results
12:15 – 13:30	Lunch break
13:30 – 14:30	Using ISMARA and CREMA: data types, data upload, web interface, advanced interactive analysis features and command line tools.
14:30 – 16:00	Questions and answers session, exploring user provided datasets
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 16

The tutorial is targeted to a broad audience. This meeting should be of interest to all people interested in inference of gene regulation from gene expression and chromatin data including computational biologists, bioinformaticians and experimental biologists. Since no specific bioinformatic skills are required in order to perform the analysis, purely experimental researchers with no data analysis background are also encouraged to attend.

The participants are expected to be familiar with the molecular biology of gene regulation. No specific bioinformatics or data analysis skills are required.

Users should, whenever possible, bring their own laptop and upload their own data in advance to allow exploration of results on their own data during the tutorial. A wireless connection will be needed for users to be able to interact with the system.

Organisers

Erik van Nimwegen (Professor and Group Leader; University of Basel & SIB Swiss Institute of Bioinformatics; Switzerland)
Mikhail Pachkov (Senior Research Assistant; University of Basel & SIB Swiss Institute of Bioinformatics; Switzerland)

Semantic representation of clinical data in RDF

Overview

beginner level –– clinical data –– data interoperability –– semantics –– RDF (resource description framework) –– FAIR

Healthcare information is collected in very diverse systems. Often, ad hoc databases or data models are created for a specific medical use case. When dealing with heterogeneous and sensitive health-related data in the research setting, one of the biggest challenges is to bring the data together and achieve interoperability. The Swiss Personalized Health Network (SPHN)¹ provides a semantic framework to foster interoperability of health-related data across the fragmented Swiss health care systems. During the past two years, a Resource Description Framework (RDF)-based FAIR (Findable, Accessible, Interoperable and Reusable) data framework has been developed to define, represent and store clinical data using common semantics. Biomedical data from any institution, implementation or platform can be expressed in this framework and semantically annotated. The encoded data are represented as standard URIs that allow direct linking to common ontologies such as SNOMED CT. The flexibility of RDF enables the use of existing standard terminologies of interest, such as ICD-10, LOINC or SNOMED CT, and the framework can be extended for unforeseen needs. This allows researchers to use the knowledge encoded in the terminology together with their data.

Learning objectives
Participants will learn how to access, understand and use the SPHN RDF schema. After the course, they will know about the infrastructure/tool stack provided by SPHN and BioMedIT to handle the SPHN RDF data. The tutorial's main objectives are to allow participants to:

Learn about the SPHN RDF schema and the use of the SPHN infrastructure
Learn how to visualize and browse graph-based data
Learn how to query data with SPARQL
Learn how to do reasoning using external terminologies such as SNOMED CT
Learn about common packages for Python and R to be used with RDF data
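
As a rough illustration of what querying graph-based data means, the sketch below stores a few RDF-style triples in plain Python and matches a single triple pattern, which is the basic building block of a SPARQL query. The identifiers (`ex:patient1`, `ex:hasDiagnosis`, `ex:code123` and so on) and the helper `match` are hypothetical placeholders, not terms from the SPHN RDF schema. A real project would use an RDF library and a SPARQL engine rather than this toy matcher.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All identifiers are hypothetical and not taken from the SPHN RDF schema.
TRIPLES = [
    ("ex:patient1", "ex:hasDiagnosis", "ex:diagnosis1"),
    ("ex:diagnosis1", "ex:hasCode", "ex:code123"),
    ("ex:patient2", "ex:hasDiagnosis", "ex:diagnosis2"),
    ("ex:diagnosis2", "ex:hasCode", "ex:code456"),
]


def match(pattern, triples=TRIPLES):
    """Return variable bindings for one triple pattern.

    Terms starting with '?' are variables, as in SPARQL; other terms
    must match the stored value exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated in the pattern must bind consistently.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No mismatch found: this triple satisfies the pattern.
            results.append(binding)
    return results


# Roughly: SELECT ?patient ?dx WHERE { ?patient ex:hasDiagnosis ?dx }
patients = match(("?patient", "ex:hasDiagnosis", "?dx"))
```

A full SPARQL query combines several such patterns and joins their bindings on shared variables.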

Schedule

Time	Activity
09:00 – 09:10	Welcome and Introduction
09:10 – 09:30	The Swiss Personalized Health Network infrastructure: Introduction and aim
09:30 – 10:00	RDF as exchange format for clinical data in SPHN: Why and how?
10:00 – 10:15	SNOMED CT in SPHN
10:15 – 10:45	Coffee break
10:45 – 11:15	How to expand the SPHN RDF schema for your project
11:15 – 12:15	How to visualize your own schema and data
12:15 – 13:30	Lunch break
13:30 – 14:30	How to query your data with SPARQL
14:30 – 15:00	How to integrate terminologies such as SNOMED CT to do reasoning
15:00 – 16:00	How to use Python and R with RDF data
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 12

This tutorial is for (clinical) scientists working with RDF data or interested in applying RDF to their clinical data. Attendees should have very basic bioinformatics and data analysis skills. A basic knowledge of RDF and graph-based technologies is helpful but not required.

Organisers

Sabine Österle (Team Lead Data Interoperability; SIB Swiss Institute of Bioinformatics; Switzerland)
Vasundra Touré (Scientific Coordinator; SIB Swiss Institute of Bioinformatics; Switzerland)
Kristin Gnodtke (Senior Clinical Data Specialist; SIB Swiss Institute of Bioinformatics; Switzerland)

¹ SPHN is a national initiative with the goal of developing, implementing and validating a coordinated data infrastructure, in order to make health-relevant data interoperable and shareable for research in Switzerland. An integral part of SPHN is the BioMedIT project: a national IT infrastructure backbone that enables nationwide health-data exchange for research. BioMedIT provides researchers all over Switzerland with access to a secure and protected computing environment for analysis of sensitive data without compromising data privacy.

Understanding protein and glycoprotein structure – activity relationship

Overview

beginner level –– protein 3D structure and activity –– glycosylation –– personalized medicine and oncology

Experimental or modelled 3D structures are widely used as a main source of information in the studies of the structure-activity relationships of proteins. However, post-translational modifications (PTM), including glycosylation, are often neglected even though they are known to play a major role in protein structure stability, solubility, protein-protein recognition and resistance to aggression.

This tutorial will demonstrate why and how combining information on protein structures, sequences and glycosylation can help to better understand protein structure and activity. It will also show the utility of such analyses in the field of personalized medicine.

Learning objectives
At the end of this tutorial, attendees should be able to interpret 3D structures of proteins in terms of molecular interactions, understand the effect of mutations on the modified protein structure and activity, appreciate the role of structural bioinformatics in precision oncology and be able to use the related Web tool Swiss-PO.ch. Examples will be taken from real mutations occurring in cancer cells of patients that were discussed during the weekly Molecular Tumor Board of the “Reseau Romand d’Oncologie”. Attendees will also be introduced to considering the effect of gaining or losing a glycosylation site on protein structure and function following the examples provided with Swiss-PO.ch. The main aspects covered are:

Introduction to molecular interactions
Introduction to protein structure and activity
Introduction to post-translational modifications with a focus on glycosylation
Introduction to GlyConnect: browse and search site-specific glycosylation data
Introduction to GLYCAM-Web: how to model glycoproteins
Introduction to the role of structural bioinformatics in precision oncology
Introduction to Swiss-PO.ch: objectives, content and how-to

Schedule

Time	Activity
09:00 – 09:45	Introduction to molecular interaction, and protein structure and activity
09:45 – 10:15	Introduction to glycosylation as a distinct post-translational modification: 2D and 3D representations
10:15 – 10:45	Coffee break
10:45 – 12:15	Tutorial & exercises: using GlyConnect and GLYCAM-Web
12:15 – 13:30	Lunch break
13:30 – 14:00	Introduction to the role of structural bioinformatics in precision oncology
14:00 – 15:15	Tutorial & exercises: using Swiss-PO.ch
15:15 – 16:00	Open discussion and closing remarks
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 16

This tutorial targets a wide audience, ranging from Master students to senior researchers, with an interest in proteomics, structural bioinformatics and protein structure and activity.

Attendees will need a laptop equipped with a recent browser (Google Chrome or Firefox).

Organisers

Vincent Zoete (Assistant Professor and Group Leader; UNIL-CHUV, Ludwig Institute for Cancer Research, University of Lausanne & SIB Swiss Institute of Bioinformatics; Switzerland)
Fanny Krebs (Postdoctoral Researcher; UNIL-CHUV, Ludwig Institute for Cancer Research, University of Lausanne & SIB Swiss Institute of Bioinformatics; Switzerland)
Oliver Grant (Research Scientist; University of Georgia; USA)
Frédérique Lisacek (Group Leader; SIB Swiss Institute of Bioinformatics & University of Geneva; Switzerland)

Beyond the usual Docker tutorial - Web apps & CI

Overview

intermediate level –– reproducibility –– data management –– container

Going beyond the usual 20-minute Docker tutorial... lots of tutorials on the net leave the user at the point where they can run a simple application out of a single container. But how do you get your own full-blown web service in the cloud thereafter?

Connecting your web service to the cloud requires multiple containers working together in the scope of container orchestration. In a real-world setup, this requires the containers to be provided via a container registry. Finally - since no one likes to rebuild containers manually for every code change - the whole deployment should be automated by continuous integration (CI).

Participants of this tutorial will learn how to take a (Django) web app, distribute it into multiple containers and run it via Docker Compose as an entry-level orchestration tool. The pipeline will be stored within a Git repository, allowing participants to learn about GitLab's CI/CD functionality and to integrate it into their projects. GitLab secrets will also be covered, so that confidential information such as passwords does not have to be stored inside the Git repository while using CI/CD.
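
One common way to keep passwords out of the repository is to have the application read them from environment variables, which CI/CD systems such as GitLab can inject into jobs at run time. The sketch below shows the idea for database settings. The variable names (`DB_NAME`, `DB_USER`, `DB_PASSWORD`, `DB_HOST`) and the helper `load_db_settings` are assumptions for illustration and are not part of the tutorial material.

```python
import os


def load_db_settings(env=os.environ):
    """Build a database settings dict from environment variables.

    Variable names are hypothetical; the point is that no secret
    is hard-coded in the repository.
    """
    password = env.get("DB_PASSWORD")
    if not password:
        # Fail early instead of starting with an empty password.
        raise RuntimeError("DB_PASSWORD is not set")
    return {
        "NAME": env.get("DB_NAME", "app"),
        "USER": env.get("DB_USER", "app"),
        "PASSWORD": password,
        "HOST": env.get("DB_HOST", "localhost"),
    }
```

In a Django project, a dict like this could feed an entry of the `DATABASES` setting, with the values supplied by the CI/CD configuration or the Compose file rather than committed to Git.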

Learning objectives
After the tutorial, attendees should have an idea of how to organise a web application, including the web server, databases and other components, in Docker Compose. They will also learn to use GitLab CI/CD to deploy their web application automatically.

Schedule

The coarse-grained schedule for the tutorial is:

Introduction of the example application
- How is it set up on bare-metal?
- How is it set up with Docker Compose?
- Some Docker Best Practices
Container orchestration
- Introduction of different orchestration tools
- Docker Compose configuration (of the example application)
CI/CD
- Introduction of CI/CD terminology
- GitLab CI configuration
- GitLab CI secrets
- GitLab Runners

Audience and requirements

Maximum number of participants: 17

This tutorial is suited for developers, web developers, DevOps engineers and sysadmins. Attendees should have basic experience with Docker and Docker Compose and should know Git and GitLab (or a similar system). Basic Linux shell experience is also required.

Attendees should bring their own laptop with an SSH client and Git installed plus their favourite editor.

Organisers

Stefan Bienert (Bioinformatician; SIB Swiss Institute of Bioinformatics; Switzerland)
Pablo Escobar López (Linux Sysadmin; SIB Swiss Institute of Bioinformatics; Switzerland)
Jaroslaw Surkont (Bioinformatician; SIB Swiss Institute of Bioinformatics; Switzerland)

Defining genomic signatures with non-negative matrix factorization

Overview

intermediate level –– genomic data analysis –– single-cell –– R

In the biological and clinical context, the identification of molecular signatures and corresponding feature extraction are two critical steps to understand diverse biological processes. In particular, a signature is defined as a group of molecular features (e.g. genes or genomic regions) that are sufficient to identify a certain genotype or phenotype. For instance, expression signatures link a phenotype to a certain pattern of gene expression^1,2 whereas enhancer signatures define subtypes based on the regulatory landscape³.

Non-negative Matrix Factorization (NMF) has been widely used for the analysis of genomic data to perform feature extraction and signature identification^4,5. However, running a basic NMF analysis requires the installation of multiple tools and dependencies, along with a steep learning curve and computing time. To mitigate such obstacles, we developed ButchR and ShinyButchR⁶, a novel NMF toolbox that provides a complete NMF-based analysis workflow, allowing the user to perform matrix decomposition using NMF, feature extraction, interactive visualization, relevant signature identification and association to biological and clinical variables.
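
To make the idea of matrix decomposition concrete, here is a minimal sketch of NMF using the classic multiplicative update rules of Lee and Seung, written in plain Python for readability. It is not the ButchR implementation. The function name `nmf`, the toy matrix and the parameter defaults are assumptions for illustration, and real analyses should rely on an optimized library.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy matrix: rows are features, columns are samples (values are made up).
V = [[5.0, 4.0, 0.0, 1.0],
     [4.0, 5.0, 1.0, 0.0],
     [0.0, 1.0, 5.0, 4.0],
     [1.0, 0.0, 4.0, 5.0]]
W, H = nmf(V, rank=2)
```

Here each column of `W` can be read as a signature over features and each row of `H` as the exposure of the samples to that signature. Because the updates only multiply non-negative numbers, both factors stay non-negative throughout.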

Learning objectives
The aim of this tutorial is to learn how to use ButchR to perform signature identification in different types of genomic data. To explore the results of an NMF analysis, we will provide a ready to use Docker image with RStudio, ButchR, and pre-loaded publicly available datasets, including bulk and single-cell RNA-seq data, as well as an interactive application. The tutorial will show how to run an NMF-based analysis from start to end.

Schedule

Time	Activity
Session 1 - Introduction
09:00 – 09:30	Ice breaker: Course expectations
09:30 – 10:15	Introduction to Non-Negative Matrix Factorization (NMF) and its usage in genomics
10:15 – 10:45	Coffee break and discussion
Session 2 - Matrix decomposition
10:45 – 11:15	How to use ButchR with Docker
11:15 – 11:45	Pre-processing data to use with NMF
11:45 – 12:15	Matrix decomposition with ButchR
12:15 – 13:30	Lunch break
Session 3 - Results interpretation
13:30 – 14:00	Selection of optimal factorization rank
14:00 – 14:30	Signature identification
14:30 – 15:00	Feature extraction and enrichment analysis
15:00 – 15:30	Interactive analysis with ShinyButchR
Session 4 - Discussion
15:30 – 16:00	Discussion and concluding remarks
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 12

This tutorial is for computational biologists dealing with large scale omics datasets (e.g. RNA-seq, ATAC-seq, …) looking for solutions to reduce the dimensionality of the data to a small set of informative signatures.

The attendees are expected to bring their own laptop with Docker pre-installed. To avoid any delay in setting up the container during the practice sessions, the Docker image for the tutorial should be downloaded beforehand. This can be done by opening a command-line terminal (e.g. PowerShell or Terminal) and running the command “docker pull hdsu/butchr”. A complete overview of how to install Docker can be found here: https://docs.docker.com/desktop/. In addition, a detailed explanation of how to use the ButchR Docker image can be found here: https://hub.docker.com/r/hdsu/butchr. Basic R coding skills will be helpful, although the tutorial will cover all the steps, from loading data to exporting results.

Upon arrival, the attendees will receive an R Markdown file with a step-by-step guide of how to use ButchR and ShinyButchR including an example dataset and how to interpret the NMF results.

Organisers

Carl Herrmann (Group Leader, University Clinics Heidelberg, Germany)
Andres Quintero (PhD candidate, University Clinics Heidelberg, Germany)

References

^1. Szymczak, F., Colli, M. L., Mamula, M. J., Evans-Molina, C. & Eizirik, D. L. Gene expression signatures of target tissues in type 1 diabetes, lupus erythematosus, multiple sclerosis, and rheumatoid arthritis. Sci. Adv. 7, (2021).

^2. Sotiriou, C. & Pusztai, L. Gene-Expression Signatures in Breast Cancer. N. Engl. J. Med. 360, (2009).

^3. Gartlgruber, M. et al. Super enhancers define regulatory subtypes and cell identity in neuroblastoma. Nat. Cancer 2, (2021).

^4. Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature (2020). doi:10.1038/s41586-020-1943-3

^5. Pal, S. et al. Isoform-level gene signature improves prognostic stratification and accurately classifies glioblastoma subtypes. Nucleic Acids Res. 42, e64 (2014).

^6. Quintero, A. et al. ShinyButchR: Interactive NMF-based decomposition workflow of genome-scale datasets. Biol. Methods Protoc. 5, (2020).

^7. Liu, L. et al. Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity. Nat. Commun. 10, (2019).

Deep learning on biological sequences: from HMMs to RNNs

Overview

intermediate level –– biological sequences –– data analysis –– deep learning –– python

The abundance of biological sequence data is growing exponentially, and is revolutionizing many fields of research, from medicine and agriculture to energy and manufacturing. Machine learning offers a powerful toolkit to computational biologists for interpreting and capturing value from this vast ocean of data. One particularly important machine learning task is to identify patterns from variable-length biological sequences that are associated with a functional outcome. This tutorial will introduce attendees to a technique for pattern recognition with a long tradition in bioinformatics—Hidden Markov Models (HMMs). With that as a backdrop, we will then introduce a modern approach to pattern recognition—Recurrent Neural Networks (RNNs). We will learn about these two algorithms in the context of a codon optimization problem, where we train each model to design a gene sequence from a protein sequence in an optimal way for expression in a new host organism. Within this well-understood context, we will explore how each model is structured and the associated assumptions. We will outline algorithms for exploiting the models, and compare the advantages and disadvantages of these two frameworks. We will gain practical experience by performing codon optimization with open-source software implementations of these models. We will finish by discussing the ways that RNNs are being leveraged in recent computational biology publications.

Learning objectives
After this tutorial, participants will:

Understand the structure of HMMs and RNNs, and the relative strengths of each approach.
Have solved a bioinformatics codon optimization problem using both modeling approaches by means of freely available Python packages.
Have connected their new knowledge with recent uses of RNNs in computational biology research.

Schedule

Time	Activity
09:00 – 09:30	Overview of tutorial, set up computing environment.
09:30 – 10:30	Introduction to Hidden Markov Models (HMMs). Discuss assumptions. Introduce Forward-Backward and Viterbi algorithms for estimating probable hidden states given an HMM and input sequence. Work through examples of HMM setup, use, and interpretation with toy examples.
10:30 – 10:45	Coffee break
10:45 – 12:30	Introduction to codon optimization. Walk through an example of HMM setup based on a host genome, application to the codon optimization of an exogenous protein, and evaluation of the results, all using the hmmlearn package.
12:30 – 13:30	Lunch break
13:30 – 14:30	Introduction to Recurrent Neural Networks (RNNs). Discuss assumptions. Introduce back-propagation for parameter fitting. Work through examples of RNN setup, training, prediction and interpretation with toy examples.
14:30 – 15:30	Walk through an example of RNN setup and training based on a host genome, application to the codon optimization of an exogenous protein, and evaluation of the results, all using the PyTorch package.
15:30 – 16:00	Highlight recent computational biology literature using RNNs. Connect what we practiced with the literature, and expand on the possibilities of model application
17:00	[BC]² Welcome lecture
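
For orientation, the Viterbi algorithm listed in the schedule above can be sketched in a few lines of plain Python. This is a generic textbook version, not the hmmlearn implementation used in the tutorial. The two-state model and all probabilities below are made-up toy values.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations.

    Works in log space to avoid numerical underflow on long sequences.
    All probabilities are assumed to be strictly positive.
    """
    # best[t][s] = log-probability of the best path ending in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Pick the predecessor state that maximizes the path score.
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + math.log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy model: GC-rich vs AT-rich regions emitting nucleotides (made-up values).
states = ["GC_rich", "AT_rich"]
start_p = {"GC_rich": 0.5, "AT_rich": 0.5}
trans_p = {"GC_rich": {"GC_rich": 0.9, "AT_rich": 0.1},
           "AT_rich": {"GC_rich": 0.1, "AT_rich": 0.9}}
emit_p = {"GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
          "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}
path = viterbi("GGCCATTA", states, start_p, trans_p, emit_p)
```

The returned path assigns one hidden state to each observed symbol.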

Audience and requirements

Maximum number of participants: 24

This tutorial is intended for participants with past experience using Python and familiarity with introductory molecular biology concepts (i.e. “the central dogma”). Participants should have an introductory foundation in statistics and/or machine learning as we will rely on ideas such as probability and inference. This tutorial will not cover the derivations of the algorithms under discussion, or require similar advanced mathematical skills.

Organisers

Matthew Biggs (Computational Biologist at AgBiome Inc. & Adjunct Assistant Professor of Biostatistics at the University of North Carolina; USA)

Publication perfect: painlessly creating powerful plots

Overview

intermediate level –– data visualization –– R and ggplot2

One of the biggest challenges in disseminating your research is visualizing the results in a way that is meaningful, easy to interpret and aesthetically pleasing. Oftentimes, the time spent creating and optimizing figures can rival the extensive time dedicated to generating the experimental results. With a point-and-click environment, you can spend hours or even days tweaking the settings to get the perfect figure, only to realize that you now have to repeat this process for the remaining data. This process can be especially challenging when you need to customize figures or adapt them to guidelines from conferences, journals or other publishing platforms.

In this tutorial, we introduce an efficient and reproducible workflow in R for creating publication-ready figures. We will introduce ggplot2 syntax to create custom plots, and we will explore how to determine the type of plots most appropriate for your data. We will explore how to ensure consistency between figures using custom theme and colour selections, with an emphasis on colourblind-friendly palettes from the RColorBrewer and viridis packages. We will also examine methods for enhancing our plots with functions from the ggpubr and cowplot packages, especially regarding layout and labelling of figures. Finally, we will conclude with an activity to use what we have learned to reproduce a published figure.

Expected Goals

Learn how to determine the type of plots that are best for your data
Appreciate the power and flexibility of ggplot2 to create custom plots
Know how to use custom functions and palettes to create figures with consistent themes, styles and colours
Understand how to use the R packages cowplot and ggpubr to easily add layouts and labels often required in published figures
Know how to save plots in a variety of formats

Learning Objectives

Determine the plot types best for visualizing a given dataset
Define the syntax for creating a plot using ggplot2
Generate plots for various data types using ggplot2
Explain how to create multiple plots using the same themes, styles, and colours
Discuss how to quickly alter figures to meet a different set of requirements (different journal or conference)

Schedule

Time	Activity
09:00 – 09:10	Introduce the instructors and scope of the workshop (lecture)
09:10 – 09:30	Introduction to the dataset (discussion): discuss how to determine appropriate plotting methods for your data; describe the types of relevant plots you would like to include/create
09:30 – 10:15	Explore ggplot2 syntax and plots (live coding): examine the ggplot2 syntax for a basic scatter plot; customize the scatter plot by adding layers to the base plot
10:15 – 10:45	Coffee break
10:45 – 11:45	Discuss creating consistent plots (live coding): create functions for themes to use with all figures; define colour palettes to keep colours consistent
11:45 – 12:15	Introduce features of cowplot for aligning and labelling plots (live coding)
12:15 – 13:30	Lunch break
13:30 – 14:15	Introduce features of ggpubr for adding statistical comparisons and ordering of plots (live coding)
14:15 – 15:10	Practice by walking through re-creating published/provided figure(s) (live coding)
15:10 – 15:50	Practice by changing code to adhere to a journal’s figure requirements (live coding)
15:50 – 16:00	Wrap-up and exit survey (lecture)
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 18

This tutorial is for researchers interested in using R to create publication-ready figures. It is a hands-on tutorial in which the data and code will be distributed to participants who wish to follow along. All tutorial lessons and materials will be hosted on GitHub pages. Participants will be required to have R and RStudio downloaded and installed on their personal computers, in addition to any required R packages. This tutorial assumes an intermediate level of R knowledge.

Organisers

Mary Piper (Research Scientist and Associate Director of Training; Harvard T.H. Chan School of Public Health; USA)
Radhika Khetani (Director for Training; Harvard School of Public Health; USA)

Workshops

Workshops encourage participants to discuss technical issues, exchange research ideas, and share practical experiences on some focused or emerging topics in bioinformatics. Please note that some of the workshop schedules still need to be adapted to align with the start of the [BC]² Welcome lecture.

The following workshops have launched additional abstract calls to allow participants to present their work in the respective context:

Toward a common framework for annotated, accessible, reproducible and interoperable computational models in biology (submission deadline 15 July)
Federating computational analyses with GA4GH standards (submission deadline 30 July)
BioNetVisA: biological network reconstruction, data visualization and analysis in biology and medicine (submission deadline 16 August)

More information on how to submit your abstract to these workshops can be found in the workshops' descriptions.

All workshops take place at the Biozentrum, new building (University of Basel, Spitalstrasse 41, 4056 Basel).

Toward a common framework for annotated, accessible, reproducible and interoperable computational models in biology

Overview

beginner level –– systems biology –– modelling –– data annotation and curation

Computational models have long been used in Systems Biology to answer a variety of questions regarding the dynamical behaviours of complex systems. As the number of computational models rapidly increases, addressing model reproducibility and reusability, and annotating models in community-supported and standardised formats, is needed more than ever. At [BC]² 2019, the Consortium for Logical Models and Tools (CoLoMoTo – http://colomoto.org) organized a workshop to develop community-driven guidelines and efforts for curation and annotation of logical models (Niarakis et al., 2020). Organised by members of CoLoMoTo and Systems Modelling (SysMod - https://sysmod.info/), the proposed workshop aims to expand on these lines and bring together scientists from broader computational communities (multiscale, multicellular, and also quantitative modelling) to harmonize practices and foster interoperability and reusability of models.

Invited speakers
Several experts on data annotation, model curation, and community standard development will be invited.

List of confirmed speakers

Claudine Chaouiya (Aix-Marseille University, FR): Logical models
Dagmar Waltemath (Greifswald University, DE): FAIR, COMBINE
Henning Hermjakob (EBI, UK): BioModels
Falk Schreiber (University of Konstanz, DE): SBGN
Sarah Keating (UCL, UK): SBML
Anne Siegel (IRISA, FR): Metabolic network analysis
David Nickerson (University of Auckland, NZ): CellML and SED-ML standards
James Glazier (Indiana University Bloomington, USA): Multiscale models

CALL FOR CONTRIBUTIONS

A limited number of slots dedicated to oral presentations from selected speakers will be available after abstract submission and evaluation.

Abstract submissions for oral presentations should be sent to the following addresses: anna.niaraki@univ-evry.fr, thelikar2@unl.edu, Laurence.Calzone@curie.fr and Sylvain.Soliman@inria.fr, with the indication: [BC]² Workshop Submission.

Abstracts should include the title, author affiliation and contact information, and a summary of the work not exceeding 600 words.

Important dates

15 July 2021 – Submission deadline
30 July 2021 – Selected speakers’ notifications

Please note that all registrations to the workshop have to be done through the [BC]² registration webpage.

Schedule

The workshop will be split into four sessions. The morning sessions will be dedicated to 1) model curation/annotation and community standards development, and 2) interoperability/reusability issues and tool requirements; the afternoon sessions will cover 3) COMBINE archives and SED-ML for logical models and beyond: requirements for global rules, and 4) model repositories: requirements for model deposit pre- or post-publication.

Time	Activity
09:00 – 09:05	Welcome and introduction to the workshop
Session 1 - Model curation/annotation, and community standards (chaired by: Sylvain Soliman)
09:05 – 09:25	Building static & dynamic models in Systems Biology Anna Niarakis
09:25 – 09:45	SBGN - supporting humans in the modelling loop Falk Schreiber
09:45 – 10:05	Models for everything Sarah Keating SBML
10:05 – 10:15	Discussion
10:15 – 10:45	Coffee break
Session 2 - Interoperability/reusability issues and tool requirements (chaired by: Tomas Helikar)
11:15 – 11:35	When is a model FAIR – and why should we care? Dagmar Waltemath
11:35 – 11:55	Reproducibly reusing and archiving computational physiology models David Nickerson
11:55 – 12:15	Illustrating the needs for reproducible and interoperable logical models Claudine Chaouiya
12:15 – 13:30	Lunch break
Session 3 - Transparency in different modelling frameworks (chaired by: Laurence Calzone)
13:30 – 13:50	Metabolic network analysis Anne Siegel
13:50 – 14:10	WebMaBoSS: A web interface for simulating Boolean models stochastically Vincent Noel
14:10 – 14:30	Multiscale models James Glazier
14:30 – 15:00	Coffee break
Session 4 - Requirements for model deposit & publication (chaired by: Anna Niarakis)
15:00 – 15:20	Reproducibility in Systems Biology - Sometimes Henning Hermjakob
15:20 – 15:40	Title to be announced Tomas Helikar
15:40 – 16:00	Round table discussion
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 50

This workshop addresses students and researchers interested in learning about best practices in data/model curation/annotation and community standard developments, giving them a unique opportunity to discover and discuss the state of the art in the field.

It brings together scientists involved in BioModels (a central repository of mathematical models of biological/biomedical processes), COMBINE (the COmputational Modeling in BIology NEtwork), CoLoMoTo (the Consortium for Logical Models and Tools), SysMod (the Computational Modeling of Biological Systems Community of Special Interest of the International Society for Computational Biology (ISCB)), SBGN (the Systems Biology Graphical Notation project), SBML (Systems Biology Markup Language), SED-ML (Simulation Experiment Description Markup Language), and other relevant projects.

Organisers

Anna Niarakis (Associate Professor; UEVE, Univ Paris-Saclay & INRIA Saclay; France)
Tomas Helikar (Associate Professor; University of Nebraska; USA)
Laurence Calzone (Research Scientist; Institut Curie, U900 INSERM & Mines Paris Tech; France)
Sylvain Soliman (Researcher; Lifeware & INRIA Saclay; France)

A scientific committee has been assembled to select the presentations from the call for submissions. Its members are: Denis Thieffry (ENS, Paris, FR); Rahuman S. Malik Sheriff (EMBL-EBI, London, UK); Ioannis Xenarios (UNIL, Lausanne, CH); Ina Koch (Johann Wolfgang Goethe‐University, Frankfurt am Main, DE); Juilee Thakar (University of Rochester, New York, USA); Benjamin Hall (UCL, London, UK).

Federating computational analyses with GA4GH standards

Overview

beginner level –– FAIR –– large-scale data analysis –– genomic and biomedical data

Vast data volumes, the lack of uniform data security standards, and the maze of infrastructure solutions and computational tools constitute significant hurdles in the life sciences' race towards efficient personalized healthcare on a global scale. Open Science, FAIR Policies, and broadly adopted community standards and best practices are widely recognized as effective methods to lower these hurdles.

This insight has motivated the establishment of the Global Alliance for Genomics and Health (GA4GH), an international standard-setting and policy-framing organisation dedicated to promoting a legal, technical and scientific framework for the ethical sharing and processing of personalized health data. Supported by 650+ organisations representing 50+ countries, the standards and policies set by the GA4GH are based on broad consensus across a wide range of different interest groups and cultures. The [BC]² conference in Switzerland, with its rich history and culture of federalism and the experiences gained hosting the GA4GH Plenary Meeting in 2018, is a perfect venue for a workshop dedicated to advancing the development, establishment and promotion of GA4GH standards-based federated cloud solutions for the large-scale analysis of genomic and biomedical data.

The workshop will feature:

an introduction of the relevant GA4GH API standards* with a focus on how they are currently used in ELIXIR as part of the ELIXIR::GA4GH Strategic Partnership (Jonathan Tedds, ELIXIR Compute Platform Coordinator)
a visionary keynote lecture on encryption technology (accepted by Nature Communications) suitable for analyzing sensitive data in a federated manner, with a broader focus on the benefits and risks of a globally federated research IT infrastructure and GA4GH standards (Jean-Pierre Hubaux, EPFL)
technical presentations on solutions that currently implement, plan to implement or would benefit from implementing relevant GA4GH standards
a panel discussion of session chairs and speakers with the aim of drafting a basic technical roadmap towards a federated GA4GH-powered compute and storage infrastructure on a national/European/global level
an open floor discussion to give all participants the chance to raise their concerns, share their ideas and give feedback on the panel discussion

List of confirmed speakers

Jean-Pierre Hubaux (EPFL, CH)
Jonathan Tedds (ELIXIR Compute Platform coordinator, ELIXIR Hub)
Michael Baudis (University of Zurich & SIB Swiss Institute of Bioinformatics, CH)
Mikael Linden (CSC, Finland)
Oksana Riba Grognuz (EPFL & Swiss Data Science Center, CH)
Shubham Kapoor (SIB Swiss Institute of Bioinformatics, CH)
Tazro Ohta (DBCLS, RIKEN IMS, Japan)
Viktória Spišáková & Lukáš Hejtmánek (MUNI, Czech Republic)

CALL FOR PARTICIPATION

There are still free slots for the workshop, including one slot for a technical contribution (18 min). Note that using GA4GH standards in your work is NOT a requirement at this point; an interest in adopting one or more standards will suffice. This is because we are explicitly aiming to increase the user/implementer base of the standards.

Submit contribution

Important dates

31 July 2021 - Abstract submission deadline
August 2021 - Selected speakers' notification

Please note that all registrations to the workshop have to be done through the [BC]² registration webpage.

Schedule

Time	Activity
09:00 – 09:09	Welcome
09:09 – 09:39	Introduction
09:39 – 10:15	Technical presentations I (2 speakers à 18 min)
10:15 – 10:45	Coffee break
10:45 – 12:15	Technical presentations II (5 speakers à 18 min)
12:15 – 13:30	Lunch break
13:30 – 14:30	Visionary keynote
14:30 – 15:15	Panel discussion on federated compute
15:15 – 15:45	Open floor discussion
15:45 – 16:00	Conclusion
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 50

The expected audience includes (a) researchers who benefit from or require access to efficient federated computational analysis platforms to address important research questions, particularly in the field of personalized healthcare, (b) bioinformaticians, computational biologists, computer scientists and scientific software developers who are interested in tackling the scientific, technological and ethical challenges of large-scale biomedical data analysis openly and FAIRly, together with the global community, and (c) managers and administrators of scientific compute centers who are interested in providing access to federated, FAIR cloud computing infrastructure to their clients.

Organisers

Michael Baudis (Professor and Group Leader; University of Zurich & SIB Swiss Institute of Bioinformatics; Switzerland)
Katrin Crameri (Director Personalized Health Informatics; SIB Swiss Institute of Bioinformatics; Switzerland)
Alexander Kanitz (Co-lead of the ELIXIR Cloud Initiative; University of Basel & Swiss Institute of Bioinformatics; Switzerland)
Shubham Kapoor (Lead System Architect; SIB Swiss Institute of Bioinformatics; Switzerland)

Next to the general frameworks provided by GA4GH, ELIXIR/SIB and the SPHN, related/similar initiatives, stakeholders and projects include the GA4GH Driver Projects and other national and international initiatives, such as the 1+ Million Genomes and the Beyond 1 Million Genomes (B1MG) projects, CINECA, the European Open Science Cloud (EOSC), EUCANCan, H3Africa, various NIH initiatives, the Personal Health Train, as well as various IT, bioinformatics and pharmaceutical companies. What unites them is a common vision for a globalized, affordable and ethical healthcare system that is able to tackle complex medical problems such as cancers, rare diseases and pandemics and that is driven by scientific and technological innovation emerging from open discourse in a community effort.

BioNetVisA: biological network reconstruction, data visualization and analysis in biology and medicine

Overview

beginner level –– biological networks –– molecular interactions and pathways –– modelling –– annotation, curation and contextualisation

Today's biology is largely data-driven thanks to high-throughput technologies that allow investigating molecular and cellular aspects of life on large scales. Making biological sense of the large amount of data produced requires interpreting it in the context of the biomolecular networks that govern cellular and physiological processes. In parallel to this technological revolution, the last decades have seen the accumulation of considerable knowledge about those processes and their role in health and disease.

The goal of the BioNetVisA workshop¹ is to bring together the different actors of network biology: database providers, experimental biologists and clinicians involved in systems biology approaches, and computational biologists involved in data analysis and modelling. The participants will be exposed to the different paradigms of network biology and the latest achievements in the field. The BioNetVisA workshop also aims at identifying bottlenecks and proposing short- and long-term objectives for the community, as well as discussing the accessibility of available tools to a wide range of users for everyday standalone application in biological and clinical labs. In addition, the possibilities for collective efforts and future development directions will be discussed during the round table panel.

For more information on this and past BioNetVisA workshops, please see https://bionetvisa.github.io.

Topics related to this workshop

Graphical representation of biological knowledge
Molecular interaction and pathway databases
Comprehensive signalling networks
Networks annotation and curation
High-throughput data visualization, analysis and interpretation in the context of networks
Multi-scale networks (genome, epigenome, transcriptome, proteome, metabolome, lipidome…)
Contextualization of networks (species, diseases, developmental stages…)
Networks of inter-cellular communication
Network modelling
Machine learning/Artificial Intelligence approaches in network biology
Basic research and clinical application of networks
Microbiome and networks
Single-cell data and network inference
Networks for drug repositioning and disease comorbidity

CALL FOR CONTRIBUTIONS

Abstracts for a talk or a poster in the topics described above can be submitted until 16 August 2021.

Submitted abstracts will be reviewed by the scientific program committee and a notification of acceptance for a talk or a poster will be provided to the corresponding author by 20 August 2021.

Abstract format:

Title
List of authors (the first author is the presenting author)
Affiliations
Abstract text in unstructured format (maximum 300 words)

Submission process

Abstracts can be submitted via the EasyChair submission page. Once logged in, click on the 'Submission' tab to start the submission process. Enter the authors, the title without HTML elements, an abstract of up to 300 words, and keywords, and indicate three to five relevant topics.

To assist us with creating the abstract booklet, please upload the same abstract as a Word document in the section called "Upload Paper" using this TEMPLATE.

Once the form is complete, press the 'Submit' button. You will receive an email from EasyChair. This email is solely a notification that EasyChair for BioNetVisA2021 has received the abstract.

Important dates

16 August 2021 - Abstract submission deadline
20 August 2021 - Selected speakers' notification

Schedule

Time	Activity
Session 1
09:00 – 09:30	Looking Closely at Molecular Interactions Henning Hermjakob (EMBL-EBI, Cambridge, UK)
09:30 – 10:00	Fine tuning a logical model of cancer cells to predict drug synergies: combining manual curation and automated parameterization Åsmund Flobak (NTNU, Trondheim, NO)
10:00 – 10:15	A novel approach to maximize network lifetime using novenary network model in WSN Sushree Pradhan (Sambalpur University Institute of Information Technology, IN)
10:15 – 10:45	Coffee break
Session 2
10:45 – 11:00	Enhancing the usefulness of transcriptome data in the context of Leishmania-infected macrophages with pathways biocuration and mathematical modelling Julieth Murillo Silva (Pontificia Universidad Javeriana-Cali, CO)
11:00 – 11:30	Evaluating the Reproducibility of Single-Cell Gene Regulatory Network Inference Algorithms Laura Cantini (IBENS – ENS, Paris, FR)
11:30 – 12:00	Regularization for sparsity, biological priors and neural networks in cancer survival models Valentina Boeva (ETH, Zurich, CH)
12:00 – 12:15	Kinase interaction network expands functional and disease roles of human kinases Marija Buljan (Empa Materials Science and Technology, Zurich, CH)
12:15 – 13:30	Lunch break
Session 3
13:30 – 14:00	Personalized anti-cancer drug treatment choice using RNA-seq and network analysis Alexander Kel (geneXplain, Wolfenbüttel, DE)
14:00 – 14:30	Quantitative systems pharmacology model-based drug target discovery for influenza Tomas Helikar (University of Nebraska, Nebraska, US)
14:30 – 15:00	Molecular interaction networks controlling neural stem cells Rupert W Overall (Technische Universität Dresden, DE)
15:00 – 15:15	Prediction of intratumor transcriptional heterogeneity from bulk tumors Agnieszka Kraft (ETH, Zurich, CH)
15:15 – 15:30	An explainable deep learning approach to kidney cortex cell classification Thomas Chen (Academy for Mathematics, Science, and Engineering, Rockaway, US)
15:30 – 15:45	Graphia: A platform for the graph-based visualisation and analysis of high dimensional data Tom Freeman (The University of Edinburgh, Edinburgh, UK)
15:45 – 16:00	Round table
17:00	[BC]² Welcome lecture

Audience and requirements

Maximum number of participants: 50

The workshop targets computational systems biologists, molecular and cell biologists, clinicians and a wide audience interested in updates on and discussion of the current status of network biology, pathway databases, and related analysis tools, including visualization, statistical analysis and dynamic modelling.

No computational background is required to attend the workshop. The round table panel planned at the end of the workshop will be a forum for live discussion around those topics.

Organisers

Emmanuel Barillot (U900 Institut Curie - INSERM & Mines ParisTech; France)
Hiroaki Kitano (RIKEN Center for Integrative Medical Sciences; Japan)
Inna Kuperstein (U900 Institut Curie - INSERM & Mines ParisTech; France)
Andrei Zinovyev (U900 Institut Curie - INSERM & Mines ParisTech; France)
Samik Ghosh (Systems Biology Institute - Tokyo; Japan)
Robin Haw (Ontario Institute for Cancer Research; Canada)
Alfonso Valencia (Barcelona Supercomputing Centre; Spain)

¹ BioNetVisA is an annual workshop series that brings together the different actors of network biology, from database providers, network creators, computational biologists and biotech companies involved in data analysis and modelling to experimental biologists and clinicians who use systems biology approaches. The participants are exposed to the different paradigms of network biology and the latest achievements in the field. The workshop takes place in the context of major international conferences in the field of Computational Biology and Bioinformatics such as [BC]², ECCB or ISCB.