SWAT4HCLS virtual hackathon
Welcome to the virtual hackathon of SWAT4HCLS 2021
This hackathon will be hosted on January 15th and January 19th, 2021. It is a two-day event. During these two days, participants will be able to discuss, hack, and/or collaborate on various topics related to the Semantic Web in the life sciences.
Platforms
We will host the following platforms to allow online collaboration:
Zoom
The opening and wrapping-up sessions will be hosted on Zoom: https://us02web.zoom.us/j/84103126982?pwd=QzVSenhRTHZFQk9paEZ6ZEhjVXJlZz09
Discord
On Discord we have created a SWAT4HCLS section, where it is possible to host text and voice conversations; within a voice conversation, it is also possible to share screens. The SWAT4HCLS channel can be accessed by following this link. A set of preset channels is already available.
Remo
Inspired by the previous edition of the biohackathon, we are also hosting the event on Remo.
Live streaming of presentations and tutorials
It is possible to present results or give a short tutorial or demo. Participants interested in this opportunity should reach out to us on Discord to arrange a streaming session. These sessions will be streamed live and will remain available online afterward.
Registration
The event is free, but registration is required. Please register through Eventbrite.
Agenda
Friday January 15th
time | description | links |
---|---|---|
09:00 | welcome, introduction and pitches | (Zoom) |
 | hacking, collaborating, writing papers, etc | (Discord) (Remo) |
12:30 | report back | (Zoom) |
 | hacking, collaborating, writing papers, etc | (Discord) (Remo) |
17:00 | report back | (Zoom) |
 | hacking, collaborating, writing papers, etc | (Discord) (Remo) |
21:00 | wrapping up | (Zoom) |
 | hacking, collaborating, writing papers, etc | (Discord) (Remo) |
Tuesday January 19th
time | description | links |
---|---|---|
09:00 | welcome, introduction and pitches | (Zoom) |
 | hacking, collaborating, writing papers, etc | |
12:30 | report back | (Zoom) |
 | hacking, collaborating, writing papers, etc | |
17:00 | report back | (Zoom) |
 | hacking, collaborating, writing papers, etc | |
21:00 | wrapping up | (Zoom) |
 | hacking, collaborating, writing papers, etc | |
Pitches
Pitches for what to work on are welcome. Please add your pitch below or in the following GDoc.
Title: Using FAIR in healthcare
SWAT has recently expanded into the healthcare realm. Yet, despite efforts from many people, including me, FAIR is still not widely used in healthcare. This is due to many factors, but making data FAIR remains one of the biggest bottlenecks. I would therefore like to brainstorm about strategies that would make this process easier and thereby help FAIR gain more traction in clinical practice.
Discord: Chat and Voice channels Pitch by Rianne Fijten
Title: Creating subsets from Wikidata.
During the past editions (2019-2020) of both the biohackathons and SWAT4HCLS, we worked on extracting subsets. During the last virtual biohackathon we were able to extract subsets from Wikidata using Shape Expressions. We would like to use the SWAT4HCLS hackathon to continue this development, i.e. finalizing the above-mentioned pipeline, but also deploying the workflow on a set of use cases.
Channel: chat Pitch by Jose Emilio Labra Gayo
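To give a feel for the shape-based step in this pipeline, here is a minimal sketch using PyShEx to validate a single Wikidata entity against a toy shape. The shape, focus entity, and shape IRI are illustrative, not the pipeline's actual schema.

```python
# Minimal sketch, assuming PyShEx (`pip install PyShEx`) and requests.
import requests
from pyshex import ShExEvaluator

# A toy ShEx shape: the focus node must have at least one date of birth (P569).
SHEX = """
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
<http://example.org/human> {
  wdt:P569 xsd:dateTime+    # at least one date of birth
}
"""

# Fetch one entity's RDF from Wikidata (Douglas Adams, Q42).
rdf = requests.get("https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl").text

for r in ShExEvaluator(rdf=rdf,
                       schema=SHEX,
                       focus="http://www.wikidata.org/entity/Q42",
                       start="http://example.org/human").evaluate():
    print(f"{r.focus}: {'conforms' if r.result else r.reason}")
```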
"Continue collaboration around subsetting and layering of open knowledge graph data using Docker-packaging, Wikidata subsetting, and schema mapping strategies to make it easier to mix-and-match SPARQL databases and datasets. For example, extracting lifesciences-oriented wikidata content, translating to bioschemas / schema.org vocabulary, and mixing with crawled bioschemas data from Proteomics web sites. Discussion topics could include questions around best practices and documentation for packaging (docker etc.) the combined datasets alongside software such as SPARQL databases and querying clients."
Channel: chat Pitch by Dan Brickley
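As one hedged illustration of the extract-and-translate step mentioned above, assuming SPARQLWrapper; the query and the schema.org/bioschemas mapping are illustrative, not an agreed profile:

```python
# Sketch: pull a few Wikidata proteins and re-emit them as
# schema.org/bioschemas-flavoured JSON-LD nodes.
import json
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="swat4hcls-hackathon-demo/0.1")
sparql.setQuery("""
SELECT ?protein ?proteinLabel WHERE {
  ?protein wdt:P31 wd:Q8054 .            # instance of: protein
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)

nodes = [{"@context": "https://schema.org",
          "@type": "Protein",            # bioschemas-derived type, illustrative
          "@id": row["protein"]["value"],
          "name": row["proteinLabel"]["value"]}
         for row in sparql.query().convert()["results"]["bindings"]]
print(json.dumps(nodes[:2], indent=2))
```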
Title: Expressing Wikidata as indexed binary RDF
Building on the Wikidata subsetting work, the [HDT (Header Dictionary Triples)](https://www.rdfhdt.org/) binary RDF format provides random access to RDF data. This allows programs to work with large datasets using limited memory. The goal is to express Wikidata in HDT and develop tools (e.g. ShEx validators) to consume it without burdening the BlazeGraph SPARQL server.
Channel: chat Pitch by Eric Prud'hommeaux
YouTube: https://youtu.be/Wd_sSecDAE8
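To show what "random access with limited memory" looks like in practice, here is a minimal sketch using the pyHDT bindings; the HDT file name is illustrative, and an actual Wikidata HDT dump must be obtained separately.

```python
# Minimal sketch, assuming the `hdt` package (pyHDT) is installed and a
# local HDT dump such as `wikidata.hdt` exists.
from hdt import HDTDocument

doc = HDTDocument("wikidata.hdt")   # memory-maps the file; no full load

# Pattern search: all statements about Douglas Adams (Q42).
triples, cardinality = doc.search_triples(
    "http://www.wikidata.org/entity/Q42", "", "")
print(f"{cardinality} matching triples")
for s, p, o in triples:
    print(s, p, o)
```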
Title: Github actions to check semantic web artifacts
This pitch is inspired by pull-request review work done in this semantic model repository. During the pull-request review process we performed systematic manual checks, such as validating the syntax of ShEx and Turtle files and removing redundant content and unused prefixes. In this hackathon we would like to explore options for GitHub Actions that perform some of these systematic manual checks automatically.
Channel: chat Pitch by Rajaram Kaliyaperumal
YouTube: https://youtu.be/Ou4wBtSLbDQ
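As a hedged starting point, the kind of check such an Action could run might look like the script below, assuming rdflib for Turtle and PyShEx for ShEx parsing; a workflow step would simply invoke it, e.g. `run: python check_artifacts.py`.

```python
# Sketch: fail CI if any Turtle or ShEx file in the repo has syntax errors.
import sys
from pathlib import Path

from rdflib import Graph
from pyshex.utils.schema_loader import SchemaLoader

failures = 0
for path in Path(".").rglob("*.ttl"):
    try:
        Graph().parse(str(path), format="turtle")   # raises on syntax errors
    except Exception as e:
        print(f"FAIL {path}: {e}")
        failures += 1

for path in Path(".").rglob("*.shex"):
    try:
        SchemaLoader().loads(path.read_text())      # parse ShExC text
    except Exception as e:
        print(f"FAIL {path}: {e}")
        failures += 1

sys.exit(1 if failures else 0)   # non-zero exit marks the Action as failed
```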
Title: Linking Complex Portal with other FAIR resources
Complex Portal is a resource that collects and curates knowledge about macromolecular complexes in model organisms. During this hackathon we would like to propose a project on making the Complex Portal more interoperable/FAIR. This includes linking Complex Portal entries with linked-data resources such as Wikidata, WikiPathways, Scholia, and the (COVID-19) disease map projects. Please join us…
Channel: chat Pitch by Birgit Meldal and Egon Willighagen
YouTube: https://youtu.be/g_YgVZgSzYQ
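One way to survey the existing Wikidata side of this linking could look like the hedged sketch below. Note that the Wikidata property ID used here is a placeholder and must be verified on Wikidata before use.

```python
# Sketch: list Wikidata items that already carry a Complex Portal accession.
from SPARQLWrapper import SPARQLWrapper, JSON

COMPLEX_PORTAL_PROP = "P7718"   # HYPOTHETICAL placeholder, verify on Wikidata

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="swat4hcls-hackathon-demo/0.1")
sparql.setQuery(f"""
SELECT ?item ?accession WHERE {{
  ?item wdt:{COMPLEX_PORTAL_PROP} ?accession .
}}
LIMIT 20
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["item"]["value"], row["accession"]["value"])
```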
Help us collect health-related content from Twitter, Reddit, and other social media. Especially during the COVID-19 pandemic, user-generated content is becoming the most relevant and up-to-date source of health information. We would like to use this data to benchmark our self-supervised phenotyping methodology. During the hackathon, we will propose a demo assessment of our methodology on clinical data.
Channel: chat Pitch by: Vibhor Gupta
YouTube: https://youtu.be/8ebZ7q81wbw
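For the collection side, a minimal sketch using Reddit's public JSON endpoints is shown below; the subreddit and search term are illustrative, and production use should respect the platform's API terms and rate limits.

```python
# Sketch: fetch recent health-related posts from a medical-advice subreddit.
import requests

resp = requests.get(
    "https://www.reddit.com/r/AskDocs/search.json",
    params={"q": "fever", "restrict_sr": 1, "limit": 25},
    headers={"User-Agent": "swat4hcls-hackathon-demo/0.1"},
)
for post in resp.json()["data"]["children"]:
    print(post["data"]["created_utc"], post["data"]["title"])
```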
Title: ShapePaths to Access Schemas and the Data They Validate
ShEx has a practical JSON structure which could in principle be accessed via some JSON Path language (e.g. [this one](https://support.smartbear.com/alertsite/docs/monitors/api/endpoint/jsonpath.html)). However, the things we want to say with such a path language (identify triple constraints in a shape, create error reports which specify the navigation of shapes and properties that created the error, extract triples and RDF terms from RDF graphs validated by parts of a schema identified by a path) call for a more specialized syntax that will be both terse and intuitive. The goal is to continue development of the ShapePath language, driven by use cases.
Channel: chat Pitch by Eric Prud'hommeaux
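To see what a generic JSON Path gives you today over ShExJ, here is a small sketch with jsonpath-ng and a hand-written ShExJ fragment; ShapePath aims at a terser, ShEx-aware version of this kind of navigation.

```python
# Sketch, assuming jsonpath-ng is installed; the schema is illustrative.
from jsonpath_ng import parse

schema = {
    "type": "Schema",
    "shapes": [{
        "type": "Shape",
        "id": "http://example.org/UserShape",
        "expression": {
            "type": "TripleConstraint",
            "predicate": "http://schema.org/name",
        },
    }],
}

# Pull every predicate constrained directly under a shape's expression.
for match in parse("$.shapes[*].expression.predicate").find(schema):
    print(match.value)   # -> http://schema.org/name
```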
Title: Putting the UMLS Metathesaurus into a Wikibase instance
UMLS combines several well-known terminologies, such as ICD and SNOMED, into one big metathesaurus. Having triplestore and ElasticSearch capabilities behind such a rich resource could be of great value. The idea is to create a workflow Docker image that takes a downloaded UMLS version as input and runs in combination with a Wikibase Docker instance. The workflow image would create the necessary predicates and then load the whole UMLS Metathesaurus into the Wikibase instance.
Channel: chat Pitch by Andreas Thalhammer
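A rough sketch of the final loading step is below, assuming a local Wikibase Docker instance on port 8181 and an already-obtained CSRF token; login and token handling are elided for brevity, and the example concept is illustrative.

```python
# Sketch: create one Wikibase item for a UMLS concept via the MediaWiki API.
import json
import requests

API = "http://localhost:8181/w/api.php"   # illustrative local Wikibase URL
csrf_token = "..."                        # obtain via action=query&meta=tokens

item = {"labels": {"en": {"language": "en",
                          "value": "Myocardial infarction"}}}
resp = requests.post(API, data={
    "action": "wbeditentity",
    "new": "item",
    "data": json.dumps(item),
    "token": csrf_token,
    "format": "json",
})
print(resp.json())
```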
Title: Exploring grlc and Salad to align Web APIs with Linked Data
Web APIs are the most widespread way of enabling programmatic access to data on the Web, and Linked Data is the structured data underlying the Semantic Web. However, Web APIs usually rely on JSON or YAML structured data documents, and implementing Web APIs around Linked Data is often a tedious and repetitive process. grlc and Salad have recently appeared to bridge this gap. grlc is a tool that automatically converts SPARQL queries into Web APIs: a lightweight server that translates SPARQL queries stored and documented in GitHub repositories into Linked Data APIs on the fly. Salad is a schema language that describes rules for preprocessing, structural validation, and link checking of documents, providing a bridge between document- and record-oriented data modeling and the Semantic Web. The goal in this hackathon is to play around with both tools, driven by use cases such as the alignment of the Query Builder Web API and the RDF patient registry data under development in the European Joint Programme on Rare Diseases (EJP RD) project. If you want to cook something interesting in this hackathon, join us!
Channel: chat Pitch by Núria Queralt Rosinach and Rajaram Kaliyaperumal
YouTube: https://youtu.be/gK4bJ9xkZDY
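To make the grlc side concrete, here is a hedged sketch of the decorated `.rq` file convention it builds on; the endpoint, query, and Wikidata class are illustrative, not the EJP RD project's actual queries.

```python
# Sketch: write a grlc-decorated SPARQL query file. Committed to a GitHub
# repository, grlc can serve it as a Web API operation on the fly.
from pathlib import Path

QUERY = """\
#+ summary: List 10 rare diseases from Wikidata
#+ endpoint: https://query.wikidata.org/sparql
#+ method: GET

SELECT ?disease ?diseaseLabel WHERE {
  ?disease wdt:P31 wd:Q929833 .   # instance of: rare disease
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""

Path("diseases.rq").write_text(QUERY)
```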
Title: Adding logical structure to the COVID-19 epidemiology ontology
Rapid analysis of epidemiological data is necessary to monitor disease outbreaks and to allow public health institutions and governments to make timely, evidence-based decisions. The COVID-19 pandemic brought into focus the need to efficiently find, access, share, and re-use COVID-19 epidemiological data. Development of the COVID-19 epidemiology ontology started at the last BioHackathon-Europe 2020 (proposal 30) to make these data as FAIR as possible. Currently, it is a plain list of curated terms mapped to terms in OBO ontologies. The goal in this hackathon is to continue its development by defining and implementing axiom patterns. Please join!
Channel: chat Pitch by Núria Queralt Rosinach
YouTube: https://youtu.be/gK4bJ9xkZDY
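As one hedged illustration of an axiom pattern, the sketch below uses owlready2; all class and property names are illustrative, not the ontology's actual terms.

```python
# Sketch: the pattern "an X measure is a measure that is about X",
# implemented as an OWL equivalence axiom with owlready2.
from owlready2 import Thing, ObjectProperty, get_ontology

onto = get_ontology("http://example.org/covid19-epi.owl")

with onto:
    class Disease(Thing): pass
    class COVID19(Disease): pass
    class EpidemiologicalMeasure(Thing): pass
    class is_about(ObjectProperty):
        domain = [EpidemiologicalMeasure]
        range = [Disease]

    class COVID19Measure(EpidemiologicalMeasure):
        equivalent_to = [EpidemiologicalMeasure & is_about.some(COVID19)]

onto.save(file="covid19-epi.owl")
```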
Title: Converting the Reactome database to JSON-LD and exploring its potential in Elasticsearch/Siren Investigate
Making datasets available in RDF was at the heart of the Bio2RDF project. Now, with the availability of the JSON-LD standard and scalable NoSQL technologies (MongoDB, Neo4j, Elasticsearch), how can the Life Science Linked Data community benefit from these newer approaches? Our goal is to create a BioMart-like user interface built around the Reactome, ChEBI, UniProt, and GO databases, all interlinked with URIs.
Technologies: Python, JSON-LD, Elasticsearch, Siren Investigate
Channel: chat Pitch by: François Belleau
YouTube: https://youtu.be/Wd_sSecDAE8
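A minimal hedged sketch of that pipeline's two ends is shown below, assuming rdflib 6+ for built-in JSON-LD support and the elasticsearch-py client; the input file name and index name are illustrative.

```python
# Sketch: serialize an RDF graph to JSON-LD and index one document per node.
import json
from elasticsearch import Elasticsearch
from rdflib import Graph

g = Graph()
g.parse("reactome.ttl", format="turtle")   # illustrative input file

# Expanded JSON-LD: a list with one JSON object per subject node.
nodes = json.loads(g.serialize(format="json-ld"))

es = Elasticsearch("http://localhost:9200")
for node in nodes:
    es.index(index="reactome", id=node.get("@id"), document=node)
```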