Hackathons – not only for techies
The Leiden University Data Management Network enjoyed a safari tour around the data hackathons that have recently either taken place at Leiden, or involved people from Leiden. After clarifying what hackathons really are, and how everyone benefits, we brainstormed some ideas for future hackathons.
On Thursday 17th March, the Leiden Data Management Network met for "Connect & Inspire: Learning through and from hackathons". Daniela Gawehns (PhD candidate at LIACS, the Leiden Institute of Advanced Computer Science) briefly introduced the concept of hackathons in general and then, together with Digital Scholarship Librarians Ben Companjen and Kristina Hettne from the Leiden University Libraries, Centre for Digital Scholarship, shared experiences in organizing and participating in hackathons, such as the Reprohack, HackaLOD, Metadata for Machines workshops (M4M) and Bring Your Own Data workshops (BYOD). The event ended with a discussion about why, how, and when to organize a hackathon in the context of data sharing or data management.
“Pizza”, “collaborative”, “hands-on”, “intense”, “whizz-kids”, maybe even “illegal” – these are some of the words we associated with a hackathon at the start of the session. Certainly, most hackathons can be exciting, hard work, fun, and productive, and they can bring many benefits to participants and organizers alike. But in a controlled setting, they can offer a serious and highly effective means of helping researchers manipulate data and learn new data skills.
But first, to set the record straight, not all hackathons are tech-oriented, although when it comes to data management, that’s generally what we’re talking about.
Second, there are participants, of course, but just as important are the organizers and what they gain from such an event. Some hackathons might also have a competitive element and therefore jury members, who might be experts and who might present, give feedback, attract participants, or produce other benefits. So a range of people, with different skills, not just self-organised whizz-kids, might be involved in a hackathon.
What is there to gain for a participant?
They may be competing, and hoping to win something. They may come to make new connections or network with colleagues, and they may want to try out new technology. They expect to have fun and learn something new along the way. The environment lets people experiment with a safety net, so there is little risk in making mistakes or wasting time.
How do organizers benefit from hackathons?
They get to bring people together, develop their relationships with colleagues, and meet and get to know new contacts, and they hope to gain new insight and ideas for solutions related to what they do. They may also get access to the outputs: prototypes or solutions that are developed in the hackathon, and/or feedback on materials (technology, data, or new programs) that they have produced.
And there may be other gains, depending on the goals and motivations of the individuals, and on the type of hackathon.
To look at some examples:
HackaLOD (Hack on/with Linked Open Data) is an almost yearly event organized by the Network Digitaal Erfgoed. It’s an exciting event to take part in: it lasts 24 hours, overnight, in an exciting location – one time at a prison, another time in Radio Kootwijk, a beautiful old building that was a radio communications transmitter in the early 20th century. There is an open bar all the way through for coffee and drinks, great (and healthy!) food served, plus entertainment breaks – for example a band played at 4am to keep people awake. Camping beds are available if you need a rest, though they’re not the most comfortable for a good night’s sleep, and some participants prefer to keep on going throughout the night. The jury of experts drops in on participants towards the end of the event to hear their ideas and see the results.
Roughly ten teams compete, with about five members per team, hoping to win prizes of 1000 Euros for the jury winner and 500 Euros for the audience favourite. A lot of publicity is organised around the event, and the final presentations therefore gain quite an audience. For an impression, see the pictures by Jacqueline van der Kort and the blog post by Ben Companjen (in Dutch).
A core team of between five and seven people has been organizing the Reprohacks so far. Reprohacks are described as sandbox events where people check the computational reproducibility of donated work. Authors 'donate' their academic articles, and participants try to hack, or reproduce, the results from the data: reproducibility-hackathons, hence the name. Although there’s no financial prize, it’s a win-win: authors receive feedback reports on their reproducibility efforts, and participants learn from authors.
Usually they are one-day events, but as they’re not necessarily full time, they could also be week-long events. An upcoming event in Warwick, UK, will be week-long to allow for running time on a high-performance computing facility.
In 2019, the first Leiden Reprohack event took place in person, and this was the biggest so far, but during the pandemic Reprohacks have been continuing online. To make it easier for others to organize Reprohacks, there’s now a hub, and through this site, papers can be shared centrally and support given to organizing teams.
With the development of this hub, Reprohacks are catching on around Europe: most recently in Germany, with the Warwick event and another in Bern, Switzerland, coming soon. To help the take-up of reprohacking in the Netherlands, and to show people how to run an event of their own, Kristina Hettne will give a “How-to-Reprohack” tutorial at the upcoming LCRDM networking day on 29th March in Utrecht.
M4M and BYODs
These acronym-heavy events are hackathons by another name.
Support for FAIR data often extends to asking researchers to write a Data Management Plan, training them in FAIR awareness, and answering specific questions in one-to-one consultations. These interactions are primarily aimed at helping researchers to make data packages available in a repository, but how can we support researchers in making choices about their data structure and metadata earlier in their research?
M4M (Metadata for Machines) is a workshop where researchers and data stewards work together to make decisions on a metadata schema. Although these workshops are not called hackathons, they have much in common with them: people working online together to produce something new and practical. M4M workshops are best timed early in the research project, so that data can be collected in a way that makes FAIR data quicker and easier to produce. The driving force behind M4Ms is GO FAIR, in collaboration with, among others, the Leiden University Libraries, Centre for Digital Scholarship, and the Dutch health research funder, ZonMw.
BYOD (Bring Your Own Data) workshops also have much in common with hackathons. The focus is on working collaboratively and intensely to make a dataset more FAIR. Each workshop brings together experts on data modelling and handling with researchers who bring their own data, so that they can learn whilst using a real case study. The hands-on work on the dataset is interwoven with presentations, tutorials, and learning outcomes. The aims are to engage in more than just teaching, and to deliver more than just help with solving one problem. BYODs were first applied in the life sciences domain but were quickly adopted by other domains as well. The first BYOD at Leiden University Library was held in 2019.
Making hackathons attractive for everyone
We have busted a few myths: hackathons are not just for whizz-kids, they're certainly not illegal, and they are not necessarily intense, night-time events dependent on fast food and heavy caffeine intake. But some of the imagery is negative, and can be off-putting for the audience at which they’re aimed. Hackathons are always intended to be welcoming and focused on learning, so how can we make hackathons attractive?
What’s in a name? Are there any better ways to describe such an event?
With the connotations associated with hacking, it's possible that the name itself could put people off. Other names are being used for these kinds of events, such as datathons or workshops, and although these might sound less threatening, they imply none of the excitement of participating in a ‘hackathon’. What’s missing are exactly those images of surprise, discovery, excitement, and venturing into the unknown that are conjured by the term 'hackathon'. So if we want people to feel excited about a collaborative event, then perhaps we should more often call them hackathons.
Conversely, we shouldn't use the term 'hackathon' too liberally to describe teaching events: if you’re only using structured materials to deliver information, then the event can’t be considered a hackathon; there has to be an element of ‘wildness’, of unexpectedness, and maybe even of uncontrollability.
If we want to make hackathons more attractive to new participants, and to those who may feel their skills are not yet good enough to contribute, what can we do?
A lot comes down to how the event is promoted, and as the number of events grows, it might become easier to get videos, interviews, profiles of participants, and photos that can show others what’s involved. Clarifying the ways in which you can participate, what skills you need (or don’t need), and how you can fit into a team will also help.
The day itself might be designed in a way that welcomes people who choose not to participate, but wonder whether this is something they’d like to do in future. Including opportunities for an audience to take part might be a way to open up events in the future: for example, making presentations open to an audience as well as to the participants, inviting observers to social breaks, or creating interaction moments with participants during the event.
Organising future hackathons
Bearing in mind all that we’ve learnt, we brainstormed ideas for future hackathons. A suggestion was made that pseudonymization and anonymization of data is a challenge that would lend itself very well to a hackathon-style event. Data Protection Officers and Data Stewards would happily collaborate on such an event, which would give them insight as well as being a way to show people how to work with their personal research data. It would also be a way to gain PR for the services and support that are available in relation to personal data.
For people who’ve never organized a hackathon, it’s useful to get assistance from people who’ve previously organized a similar event. They can help with arranging communication, planning the structure of the day, making the preparations for it to work, and handling budget and logistics, as well as with the technical running of the event on the day itself.
The hackathon planning kit is a great starting point that guides organizers through the planning process with 12 key questions.
When it comes to location, it’s worth thinking about a shared space that’s big enough for the teams to work together, but also has space for socializing. The first Leiden University Reprohack took place in the University Library, which not only provided the space needed but was also a neutral location for participants. If we were looking to collaborate with other universities, or broaden participation beyond Leiden, then it might be worth investigating a location more central in the Netherlands. The exciting HackaLOD locations give a feeling of seclusion, and of being in a pressure cooker, which is a wonderful atmosphere for events of this style, but obviously comes at a cost.
There was a lot of excitement about the idea of experimenting with hackathons, or hackathon-style events, at Leiden University in future, so watch this space!