A One-Year Retrospective of Wikidata Software Collaboration

Prologue

Imagine a world where everyone, regardless of their background or language, can seamlessly access structured, interconnected knowledge. This vision drives the creation of Wikidata, a free and open knowledge base accessible and editable by both humans and machines.

One of the emerging projects within Wikidata is the Lexicographical Data project, also known as Wikidata Lexemes. Imagine a hyper-connected “dictionary” that serves both humans and computers, including large language models (LLMs) such as GPT. This project has the potential to transform the way we understand and visualize languages.

When I first started using Wikidata and discovered the Lexemes project, I was thrilled by its potential. However, I quickly became frustrated by its complex interface. Finding information was challenging, and advanced features like querying were confusing. The Wikidata development team has probably known about this for quite some time, but addressing it at the scale of Wikidata requires immense resources and effort.

In this post, I discuss the UX research and prototyping process for a new Wikidata tool that aims to make the Wikidata Lexemes project more accessible and user-friendly. I also share the lessons learned throughout this journey.


Wikidata Software Collaboration

Wikidata Software Collaboration is funded by the Arcadia Philanthropic Trust and was initiated by Wikimedia Deutschland to advance the movement strategy initiatives in collaboration with Wikimedia Indonesia and the Igbo Wikimedians User Group.

The project started in July 2022, and I joined in August of the same year. After 12 months of collaboration, we’ve achieved significant milestones: we learned who our community members are and how they interact with and perceive Wikidata, and we identified a solution that can be developed in a realistic timeframe. Our iterative approach, based on agile methodology, enables us to create something novel yet useful, and we hope to engage the community in development, testing, and feedback.


Research

Our initial goal was to find ways to help people contribute, reuse, and improve the quality of data within Wikidata Lexemes. We started by conducting user research to identify the needs and challenges faced by Wikidata editors in Indonesia.

This research involved semi-structured interviews with 15 Wikidata editors in Indonesia. They ranged from beginners to experienced contributors who had joined the local Wikidata community or attended Wikidata events. The interviews were conducted remotely using Zoom. We designed the interviews to explore editors’ mental models, user journeys, challenges, and perceptions of the project, and we asked participants to complete short activities related to lexeme search and contribution.

We synthesized the research findings into an affinity diagram, grouping similar ideas and concepts, which will inform the design of our new tool. Based on this research, the tool should be straightforward, mobile-first, and offer support for users.

Key findings

“If developed well, the (Wikidata) Lexemes project will produce extraordinary results! The data can be used as interesting research materials.”

A participant’s opinion about the future of Wikidata Lexemes

“Everything is done voluntarily by volunteers. Contribute, no matter how small it is. If there’s a mistake, someone will fix it, no matter how small the mistake is. Don’t be judgmental and don’t be emotional.”

A participant’s opinion about core values of a wiki community

Wikidata & Wikidata Lexemes

Steep learning curve

Understanding how to use Wikidata, particularly for lexeme search and contribution, is challenging.

Mobile editing problems

Editors often use their mobile devices to contribute to Wikidata. Unfortunately, Wikidata Lexemes cannot be edited on mobile devices except in desktop mode.

Translation and jargon

Translating content into local languages is crucial, and technical jargon must be explained clearly.

Data quality concerns

Issues such as duplicates and incomplete statements are common, often caused by editors who are unsure of what to contribute.

Unique language properties

There are concerns about accommodating non-Latin characters, various registers, and dialects specific to local languages.

Compare and contrast

The community has noted similarities between Wikidata Lexemes and other lexicography tools like WordNet, as well as comparisons with other Wikimedia projects such as Wiktionary, Wikisource, and Wikipedia.

Community

Indonesians are active Wikidata editors

Compared with other Wikidata communities, Indonesian editors are notably active.

Proud of original contribution

They prefer creating items or lexemes from scratch because they’re proud of their original contributions.

Diverse interests

The community includes both linguistics-oriented and tech-oriented individuals.

Communication channels

They communicate via WhatsApp and Telegram, with announcements also reaching members through Instagram and Twitter/X. They prefer short, visually engaging content.

Wants to be heard

The community wants to be heard but they don’t know how to give feedback and push for change.

Knowledge transfer issues

There is limited knowledge transfer between experienced members and newcomers, which hurts new-editor retention and widens the knowledge gap between the two groups.

What to edit?

There is no consensus on what to contribute on Wikidata Lexemes, especially for local languages.

Additional discussions with Wikimedia Deutschland

We discussed the findings with Wikimedia Deutschland’s UX team, focusing on several key points.

  • The abundance of tutorials and training needed for newcomers shows that the current Wikidata interface is not user-friendly. This indicates a need to lower the barrier to entry.
  • Mobile device support is important. We need to collect data on Wikidata users who edit from mobile devices. This can help us understand mobile usage patterns and highlight the need to improve their experience.
  • Even users with a linguistics background have found the current jargon too technical. This raises the issue of balancing accuracy and precision with user understanding.
  • The current wiki culture may contribute to the existing over-incentivization of individual contributions. As UX designers, we cannot address this cultural factor easily, but it is worth keeping in mind.

Demographics

Here are some insights that we can get from the data:

  • Indonesian Wikidata editors tend to skew younger.
  • All of them use laptops, and the majority also edit on mobile devices.
  • Most of them are involved in other Wikimedia projects, particularly Wikipedia, Wikisource, and Wikimedia Commons.
  • Most of them live in more developed parts of Indonesia such as Java, Bali, Sumatra, and Kalimantan.
Cover page of the report Indonesian Wikidata Research Participant Demographics

Persona

The Indonesian Wikidata community can be divided into five categories regardless of editor tenure.

Five Indonesian Wikidata user profiles
  • Andi, The Data Utilizer: Tech-oriented, academic users of Wikidata for research and product R&D.
  • Sandi, The Teaching Linguist: Teachers, educators, and linguistics-oriented people who want to learn more deeply about languages.
  • Hendra, The Affable Maintainer: Community members who become the community’s cornerstone because of technical and people skills.
  • Shinta, The People Inviter: Community members who bring new members on board and are skilled in social media and communication.
  • Joni, The Competitive Editor: Resourceful members who self-improve and continuously contribute out of passion.

Ideation

After the research, we needed to come up with ideas for a new Wikidata tool. We started by looking at well-loved tools within the community, such as ISA Tool and Lingua Libre. ISA Tool is a mobile-first website that helps users connect images uploaded to Wikimedia Commons with relevant Wikidata statements. Lingua Libre enables users to submit pronunciations of words to Wikimedia Commons in multiple languages.

We then used FigJam to conduct an ideation workshop. First, we assembled a visual moodboard to inspire creativity and set the tone for our brainstorming, then generated a list of words and phrases related to the moodboard and created a word cloud to visualize key concepts. Next, we did brainwriting, a technique where participants record their ideas on sticky notes, then share, discuss, and categorize them into themes. Lastly, we did a simple feasibility analysis based on factors such as innovativeness, scope, required resources, and time constraints.

This ideation resulted in three ideas: a mobile-first contribution tool to simplify Wikidata Lexemes contributions, a gadget to recommend lexemes for addition or editing, and a learning app using flashcards for language and lexicography.

After discussing these ideas with Wikimedia Deutschland, we concluded that the mobile-first contribution tool was the most feasible idea. The lexeme-recommendation gadget was too complex to build, and the learning app’s scope was too broad.


Prototyping

Next, we created an initial prototype for the new tool using Figma. This prototype featured:

  • A streamlined, mobile-first interface inspired by stacks of playing cards, designed for quick and easy contributions on the go.
  • Limited scope of contributions, such as adding antonyms, that can be expanded over time.
  • Community-curated topics (planned).
  • Randomized daily contributions (planned).
  • Support for multiple languages (planned).
Prototype features of the tool that was about to be developed
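
To make the planned "randomized daily contributions" idea more concrete, here is a minimal Python sketch of how a daily card stack could be chosen. It is an illustration only, not the tool's actual implementation: the `Card` fields, the `daily_deck` function, the placeholder lexeme IDs, and the sample lemmas are all hypothetical. The sketch assumes that each user should see a small stack that changes every day but stays the same if they reopen the tool on the same day.

```python
import hashlib
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Card:
    """One contribution prompt shown as a card (hypothetical structure)."""
    lexeme_id: str  # placeholder ID for illustration, not a real lexeme
    lemma: str
    task: str       # e.g. "add antonym", the limited scope named above


def daily_deck(cards, user, day, size=3):
    """Pick up to `size` distinct cards, stable for a given user and day."""
    # Derive the seed from user + date so the stack differs per day
    # but is reproducible when the same user returns on the same day.
    digest = hashlib.sha256(f"{user}:{day.isoformat()}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return rng.sample(cards, k=min(size, len(cards)))


# Sample data for illustration only.
CARDS = [
    Card("L-demo-1", "besar", "add antonym"),
    Card("L-demo-2", "panas", "add antonym"),
    Card("L-demo-3", "cepat", "add antonym"),
    Card("L-demo-4", "tinggi", "add antonym"),
    Card("L-demo-5", "baru", "add antonym"),
]

if __name__ == "__main__":
    for card in daily_deck(CARDS, user="example-user", day=date.today()):
        print(f"{card.lemma}: {card.task}")
```

Seeding from the user and date, rather than storing a stack on a server, keeps the sketch stateless. A real tool would likely also need to skip lexemes that already have the requested statement.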

Post-prototype discussion

After sharing the initial prototype with Wikimedia Deutschland, we discussed several considerations:

  • Gamification can be a double-edged sword. Badly designed gamification can give people the wrong incentives to edit. We must carefully design gamification elements to encourage meaningful collaboration rather than just the volume of contributions.
  • The tool needs to support users who don’t use touchscreens or have motor disabilities. We must incorporate alternative methods of interaction beyond gestures.
  • Ensure the tool is responsive so it works well on both mobile and desktop devices.
  • Start with developing core features, gather user feedback, and iteratively add new features over time.

Next steps

To continue working on the project, we need to:

  • Develop a detailed MVP of the mobile-first contribution tool to test its basic functionality and gather user feedback.
  • Conduct usability testing with a broader range of participants to assess the design and make improvements, ensuring the tool is user-friendly for everyone.
  • Develop, launch, and iterate on the tool using agile methodology, continuously refining it based on community feedback.
  • Engage the community throughout the development process to maximize impact, ensure accountability, and maintain transparency.

Conclusion and personal wishes

Perfect is the enemy of good. Good enough products are enough.

Working on this project has been a profound learning experience. I’ve learned how to collaborate with people from diverse cultures, principles of agile methodology, and the stories behind Wikidata development. I’ve also learned how to connect with various communities and partners, and how to brainstorm, ideate, and iterate in innovative ways. I am grateful for this opportunity and eager to see what the future holds.

I am confident that this project can thrive as a community-driven initiative. It can be sustainable and inspire others to create well-researched and thoughtfully designed products, even when the initial idea sounds simple on paper. Working on community efforts can be challenging, but the rewards are significant.

Language is more than just a system for communication; it’s a tool for achieving shared understanding. Throughout this project, I’ve learned that effective communication involves not only choosing the right words but also understanding the context and showing empathy to those we’re communicating with.

I invite anyone interested in contributing to this project to learn more about this collaboration. Your feedback on the prototype would be greatly appreciated, as it will help shape the future of the tool.

Photo of people associated with the Wikidata Software Collaboration project.

Thank you for reading until the end, and here’s to the next 12 months and beyond for this project!