GSoC: Work on Freifunk API Query Client will go on

This is the final blog post for my GSoC project, the Freifunk API Query Client.

Goals
 
We want a comfortable tool to query all the Freifunk API files, as there are nearly 100 communities all over Germany providing their data. There are already several applications like our community map, a common calendar, our feed aggregator or the community podcast collector. But it’s still hard to find communities by properties like routing protocols or focus topics.
 
Challenges
 
When we began this project we only planned to query the generated JSON data for the community in a browser and additionally provide query results via a web service. But then we talked to several people and heard about DeepaMehta, with features like connectors to OpenStreetMap. So we did something you normally don’t do: we changed our project goals before the midterm evaluations.
 
DeepaMehta is not just another database product; it provides a different way to store and handle data. It uses a graph to store connections between items and allows modelling complex data types and the associations between them. We had to change our minds and learn a new way of thinking. The API data is constantly evolving, and there are a lot of cross-references in the data, e.g. links to various node maps. We think the switch to DeepaMehta is useful because we can query the graph and add new relations and data without problems.

 
It’s difficult to handle different spec versions if you want to query all API files, because some fields changed while others were added to the specs or took on another meaning. In an ideal world all communities would update their files as soon as possible. But we all know it will never happen like that. As a workaround we first focused on fewer fields, the ones available in all versions.
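To make the workaround concrete, here is a minimal Python sketch of how the set of fields shared by all spec versions could be computed and then used to trim an API file. The version numbers and field names are made up for illustration and heavily shortened; they are not taken from the real specs, and this is not the code used in the project.

```python
# Hypothetical, heavily trimmed field lists per spec version (illustration only).
SPEC_FIELDS = {
    "0.1": {"name", "url", "location", "contact"},
    "0.2": {"name", "url", "location", "contact", "state"},
    "0.3": {"name", "url", "location", "contact", "state", "feeds"},
}


def common_fields(spec_fields):
    """Return the set of fields that appear in every spec version."""
    field_sets = list(spec_fields.values())
    if not field_sets:
        return set()
    return set.intersection(*field_sets)


def restrict(api_file, allowed):
    """Keep only the top-level keys that every spec version knows about."""
    return {key: value for key, value in api_file.items() if key in allowed}


allowed = common_fields(SPEC_FIELDS)
trimmed = restrict({"name": "Example community", "feeds": []}, allowed)
```

The trade-off is the one described above: queries only see the lowest common denominator, so newer fields stay invisible until a better way to handle versions is found.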
 
What we got
 
We’re able to import communities from the API directory as a base entity. We also tried some different ways to import and store the specs, but we need some improvements here. By using the summarized API file, the import of our payload can be done via the DeepaMehta REST API.

The switch to DeepaMehta brought a lot of complexity to the project, and I’m personally not happy with my results at this point because I had trouble spending enough time on the project. Additionally, some basic problems like dealing with a changing schema and the data import are not really solved well yet. The data is in DeepaMehta and can be queried with the included client, but it’s not in a state where it’s usable for the community.

Overall the GSoC was an interesting experience for me, though I failed to set aside enough time for the tasks. The overlap with university lectures did not make it easier. So I can only recommend making sure beforehand that you’ll have enough time to accomplish your goals. But the support from the Freifunk community was always great and helpful! As the project is not in a state that can be considered ‘ready’, I’m continuing to work on it.
 
Future Plans

I definitely want to finish the work, at least to the point where it can be used by the wider Freifunk community.

The default DeepaMehta client isn’t designed to query as many fields as our API provides. Here we need a new web-based client that gives users an interface to select fields and get a proper response.

Work will continue on integrating the API data and DeepaMehta.

Repository: https://github.com/freifunk/query.api.freifunk.net

GSoC: Freifunk API Query Client meets DeepaMehta

This post gives an overview of the ongoing work on the API query client GSoC project. As I wrote a few days ago, I met Jürgen Neumann at the WCW 2014 and he introduced me to DeepaMehta. We decided to use this tool as a database for the API data. This approach is quite a leap from my original proposal and idea, but after a few discussions we realized there are a lot of benefits to it. Here I want to give a short overview of this new approach.

What is DeepaMehta?

DeepaMehta represents information contexts as a network of relationships. This graphical representation exploits the cognitive benefits of mind maps and concept maps. Visual maps — in DeepaMehta called Topic Maps — support the user’s process of thinking, learning, remembering and generating ideas. We think that working with DeepaMehta stimulates creativity and increases productivity. Welcome to DeepaMehta

This sounds interesting but one may ask where is the connection to community data in machine readable form? The answer is in the data model. Here is an example from the website:

The data is organised in a topic map. There are topics that can represent e.g. an organisation, a person or an event. These topics are associated through a hypergraph relationship. This means that it is possible to model all kinds of relationships between topics. For example, a person can be modelled as a topic that is associated with an address, and the address consists of location data, email addresses and so on… This person can be part of several organisations, and these organisations can be aggregated by several parent organisations and so on…

We have a powerful graph to represent all kinds of information, and we know about the relationships of each bit of information to other bits…

We have a graph that we can traverse for queries. It’s straightforward to e.g. list all organisations that a person is a member of.

To take an example from the API data: We want to know which communities use “olsr” as their routing protocol. This would be an instance of the topic type “routing protocol”. We now only need to follow the links to all instances of the type e.g. “freifunk-community” that are connected to the “olsr” instance of the “routing protocol” topic type.
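The idea behind this query can be sketched in a few lines of Python. This is only a toy in-memory graph to illustrate the traversal, not DeepaMehta’s data model or API; the class `TopicGraph` and the community names are invented for the example.

```python
from collections import defaultdict


class TopicGraph:
    """Tiny undirected graph of typed topics (illustration only)."""

    def __init__(self):
        self.topic_type = {}                # topic id -> type name
        self.neighbours = defaultdict(set)  # topic id -> associated topic ids

    def add_topic(self, topic_id, type_name):
        self.topic_type[topic_id] = type_name

    def associate(self, a, b):
        # Associations are stored in both directions so we can traverse either way.
        self.neighbours[a].add(b)
        self.neighbours[b].add(a)

    def related(self, topic_id, type_name):
        """Return all topics of the given type that are linked to topic_id."""
        return sorted(
            other for other in self.neighbours.get(topic_id, ())
            if self.topic_type.get(other) == type_name
        )


graph = TopicGraph()
graph.add_topic("olsr", "routing protocol")
graph.add_topic("batman-adv", "routing protocol")
for community in ("community-a", "community-b", "community-c"):
    graph.add_topic(community, "freifunk-community")
graph.associate("community-a", "olsr")
graph.associate("community-b", "batman-adv")
graph.associate("community-c", "olsr")

# "Which communities use olsr?" becomes a one-step traversal from the olsr topic.
olsr_communities = graph.related("olsr", "freifunk-community")
```

The same `related` call works in the other direction, e.g. starting from a community and asking for its routing protocols, which is what makes the graph representation attractive for queries.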

This would allow for flexible queries. Another example would be a map where all instances of location topics are displayed and their parent topics are included as labels for the points on the map. If e.g. node data is present for communities, this would allow for a global node map that shows not only node locations but also community event locations and meeting places. Of course there is a huge amount of work to do before this will be working, but overall I hope this explains why there are a lot of benefits to using this representation.

Freifunk API data in DeepaMehta

So how will it work? I’ve tried to put it in a diagram:

More details:

Data

At the moment there is a specification – a JSON schema file – for the API data and an instance – a JSON API file.

Magic

Magic is probably the wrong word for that, but all the hard work is done by DeepaMehta and I only build a plugin on top of that, unaware of the implementation details, so I thought it is appropriate. To quote Arthur C. Clarke: “Any sufficiently advanced technology is indistinguishable from magic.”

At first, we need to put the schema into the DeepaMehta platform. This is possible using a plugin that creates the topics in DeepaMehta for the entries in the JSON schema.

The next step is to feed the current data into DeepaMehta. This creates instances of the previously defined topic types. E.g. a topic for each community.
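A rough Python sketch of these two steps, independent of DeepaMehta itself: first derive type definitions from a JSON schema, then turn one API file into instances of those types. The data type names (“text”, “number”, “composite”, …), the dotted naming scheme and the sample schema are assumptions made for this example; they are not verified DeepaMehta identifiers, and the real plugin may work quite differently.

```python
# Placeholder names for the target data types; NOT verified DeepaMehta identifiers.
JSON_TO_TOPIC_DATATYPE = {
    "string": "text",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
}


def topic_types_from_schema(schema, prefix="community"):
    """Step 1: walk a JSON schema and return {type name: data type}."""
    types = {}
    for name, spec in schema.get("properties", {}).items():
        type_name = f"{prefix}.{name}"
        json_type = spec.get("type")
        if json_type == "object":
            types[type_name] = "composite"
            types.update(topic_types_from_schema(spec, type_name))
        elif isinstance(json_type, str):
            types[type_name] = JSON_TO_TOPIC_DATATYPE.get(json_type, "text")
        else:
            types[type_name] = "text"  # fallback for missing or multi-valued "type"
    return types


def instances_from_data(data, types, prefix="community"):
    """Step 2: flatten one API file into (type name, value) pairs."""
    pairs = []
    for name, value in data.items():
        type_name = f"{prefix}.{name}"
        if type_name not in types:
            continue  # field is not described by the schema, skip it
        if isinstance(value, dict):
            pairs.extend(instances_from_data(value, types, type_name))
        else:
            pairs.append((type_name, value))
    return pairs


# Invented mini schema and API file, just to exercise the two functions.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "location": {
            "type": "object",
            "properties": {"lat": {"type": "number"}, "lon": {"type": "number"}},
        },
    },
}
types = topic_types_from_schema(schema)
instances = instances_from_data(
    {"name": "Example", "location": {"lat": 52.5, "lon": 13.4}, "unknown": 1}, types
)
```

Arrays and cross-references are deliberately left out here; they are exactly the parts where a better representation still needs work.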

Once the data is in DeepaMehta it’s possible to query that data.

Presentation

We can now speak JSON over HTTP using REST with the platform and present the results in various ways, e.g. display communities on a map or provide a text interface to query the data. DeepaMehta already provides a REST API and a web-based interface for exploring and editing topic maps, but while testing and playing with the interface we found it too complicated.

Current Progress

Feeding the data into the platform by hand is not practical, so I’m working on an import script for the schema and the API data. At the moment mapping the basic JSON types (string, integer, ..) into DeepaMehta is working, but more work needs to be done to get a better representation. Once the data is in, the focus will shift to doing actual queries.

Problems

Complexity. These are for the most part new concepts for me and I had little prior experience with semantic web technologies. DeepaMehta also covers quite a few other use cases and I need to learn more about the system.

Open Questions

Doing the actual queries and traversing the graph is something I still need to find a workable solution for. There are also different versions of the API schema, and different communities use different versions of the API. At the moment I’m ignoring that detail, but I need to find a solution here. Another nice-to-have feature would be access to historic data.

Lots of interesting problems.. Unfortunately I’ve been short on time in the past days and I’m quite behind schedule, but I’m optimistic that this approach is flexible enough to provide a solid ground for all kinds of experiments with data. Once queries are possible, things will hopefully move forward at a faster and more visible pace.