BYOD+FYOR: Semantic Hypermedia in Sports tracking in bandwidth constrained environments

Kjetil Kjernsmo, University of Oslo

Abstract

We propose a project to enable users of mobile units to follow a video stream of their choosing. Event organizers distribute the video sources at certain positions along a sports course, and these positions are made available in hypermedia resources. We propose a system in which the choice of video streams to transmit is configured from the event organizer's input as well as from the users of mobile units.

Introduction and Background

Motivation from orienteering

The sport of orienteering is mostly practiced as running in forests with a detailed map. The competitors visit a number of control points in a pre-specified order, but they are free to pursue any route between the control points. Although it is sizable in terms of participation in the Nordic countries (where the largest events exceed 20000 participants), it has not until recently seen much media attention.

Orienteers have themselves put considerable effort into producing attractive media content for broadcast, an effort that has seen some success. In addition to video content, the presumed best athletes are given a tracking device consisting of a GPS and a GPRS modem. The device transmits their position to an Internet server, from which a Java or JavaScript client can be downloaded that displays the position on the competition map. In addition, it has become possible to display these productions on big screens at the largest competition venues. The economic benefits from increased exposure have yet to materialize, however, and so it has become harder to justify the costs of the big screens for the vast majority of events.

It is in this context that we propose a different direction: Instead of a big screen, we would like to encourage interested spectators (of whom other orienteers tend to be the largest group) to bring their own devices (thus BYOD, a common acronym for Bring Your Own Device).

Moreover, we are of the impression that most spectators have a personal relationship with the athletes themselves, as family, friends, club mates, or close competitors. For them, it is not only the leader that is interesting, but the athlete that they know personally. This is a radical departure from the contemporary narrative of sports broadcasts, where the fight for the lead and eventual victory is the main and often sole focus. We call this approach Follow Your Own Runner, or FYOR.

Finally, we note that the video production may compromise, and has in the past compromised, the athletic quality of the competition. This should be avoided as much as possible.

Technical challenges

Both the BYOD and the FYOR aspects bring considerable practical challenges. There are two ways of providing bandwidth to mobile units: Wifi or mobile networks. While mobile networks are nearly ubiquitous, the latest generation, LTE, is not very widespread as of this writing, and even if it were, a large number of devices within a small area is likely to saturate the bandwidth quickly.

Wifi, or rather the 802.11 series of IEEE standards, can be deployed on an ad hoc basis in competition venues to provide sufficient bandwidth to an audience of a few hundred, a limit that is sufficient for most orienteering events. There are many practical problems to be addressed, but for the scope of this paper, we only note that although Wifi broadcasts are possible in a lab environment with known devices, they are unlikely to work for this practical purpose. This leaves us with the unicast option, but since unicast is needed to support the FYOR aspect anyway, it is hardly a concern.

The FYOR aspect also brings technical challenges in a bandwidth constrained environment. A single standard definition video stream requires around 2 Mbit/s bandwidth. Often, orienteering competition venues are in relatively remote locations, and so, the available bandwidth into the venue may not exceed 10 Mbit/s, or even less. In the forest, one may use cameras connected to LTE modems to stream to the Internet, making sure to place them where the LTE signal is strong. Even though a handful of cameras can be operated this way, it is in the above case not possible to stream more than 4-5 video streams into the venue through the link. If LTE is also used into the venue, it is also likely to consume much of the available bandwidth.
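The stream budget implied by these figures is simple arithmetic. The following is a minimal sketch using the 2 Mbit/s and 10 Mbit/s figures from the text; the function name and the optional headroom parameter are illustrative assumptions, not part of the proposed system.

```python
def max_streams(link_mbit: float, stream_mbit: float = 2.0,
                reserved_mbit: float = 0.0) -> int:
    """Number of whole video streams that fit in the link.

    reserved_mbit is a hypothetical allowance for other traffic
    (tracking data, control messages) sharing the same link.
    """
    usable = max(link_mbit - reserved_mbit, 0.0)
    return int(usable // stream_mbit)


print(max_streams(10.0))                     # 5: the link is completely full
print(max_streams(10.0, reserved_mbit=1.0))  # 4: with 1 Mbit/s set aside
```

With any headroom at all reserved for other traffic, a 10 Mbit/s link carries four streams rather than five, which matches the "4-5" estimate above.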

Recently, it has also become possible to buy relatively inexpensive radio link equipment to stream video point to point in the 5.8 GHz band. Nevertheless, this places further constraints on the placement of cameras, as well as regulatory constraints, and even though it is an option, in many cases, using the LTE infrastructure makes it possible to set up cameras on an ad hoc basis.

This motivates the constraint that cameras should not transmit unless the streams are known to be watched. Moreover, the total number of streams that can be transmitted must be configurable by the organizer's operator, so that the most important traffic is allowed to be transmitted.

Scenario

The above background and constraints prompt a more detailed scenario development. We start by identifying the different personas involved:

Persona

Athlete
Person participating in the present competition. May be interesting for others to follow. The athlete must be prevented from watching the cast before their own start, as they are not allowed to acquire knowledge about the course they will run in the event.
Spectator
Person attending the event without participating in the sport. Doesn't have any deep understanding of the intricacies of orienteering technique. May or may not have any relationship with any of the athletes.
Participant
Person attending the event who has good knowledge about orienteering as a sport, having participated in it themselves, but is not participating in the present competition. Is likely to have a personal relationship with one or more athletes.
Operator
Person designated by the organizer to perform administrative tasks on the system.
Producer
Person designated by the organizer to produce a broadcast based on available video streams.
Course-setter
Person appointed to set the courses, i.e. determine where the control points will be, to make the competition fair and interesting. Their primary concern is the quality for the athletes, but they may have to compromise for the video production.
Commentator
Person designated by the organizer or the media to provide commentary on the athletes' performance. Will be well versed in the sport.
TV watcher
A spectator or participant watching the production without being present at the venue.
Audience
The participants, spectators, and producer (the latter acting on behalf of the commentator and the TV Watchers) collectively.

Description of the race

To understand what could be interesting to watch, it is important to understand the progress of the race. The race may start with a mass start, where all athletes start simultaneously, a chase start, where they start based on a prologue (in both cases, the first to finish wins), or an interval start.

Each athlete is handed the map with the course at the moment of start. They generally start reading and running simultaneously. At some point, their route choices will diverge. Some route choices are significantly faster than others, but this is not known to the athletes or anyone else. Commentators and participants are likely to speculate based on their expert opinion, and much of the interest is generated by finding out if they are right. Moreover, the different strengths of different athletes may mean that there is no single best choice.

At some point during the course, athletes may "miss". A miss occurs when the athlete loses time relative to the route choice the athlete has made. A miss may occur en route, but misses commonly occur near the control points. Misses at the level of the best orienteers usually occur because of cognitive overload due to fatigue, excessive speed (the athletes usually run at a pace very close to the maximum ability of a very well trained endurance athlete), very complex patterns in the terrain, etc. However, it is not necessarily clear when the athlete is making the mistake. Spectators cannot be expected to understand in many cases, while participants and commentators may speculate that a mistake has happened because the athlete is following a track parallel to the one they intended. However, since a parallel track may be indistinguishable from a route choice, it can take several minutes from when the actual mistake was made to the point where it results in a clear miss.

In general, time can be lost by not running fast enough, bad route choices, and misses. Hesitation and spending excessive time on the control points may also be factors.

In some events, the athletes will pass through the venue area, to be readily visible to everyone present.

After visiting all control points in the specified order, the athletes will return to the finish line. From the last control point to the finish line, it is usually just a few hundred meters. Crossing the finish line is a culmination of the race drama in the sense that the winner is determined there. However, there are parts of the race that are more interesting for the overall narrative.

Typically, the moments where route choices or misses are made are decisive for the outcome of the race, so it is very interesting to have cameras at the places where these moments are likely to happen. Determining where these places are requires extensive collaboration between the producers and the course-setters.

Camera selection

Let us assume that cameras have been placed along the course in interesting spots. They will start streaming video if something interesting happens; the meaning of interesting will become clear later. As noted previously, they cannot all stream all the time, and so our first concern is the overall system health, i.e. the organizers must ensure that the system performs so that a production of at least one video stream for participants and spectators can be made by the producer and streamed to their devices.

Overall system health

Assuming there's a larger number of cameras than can be supported by the available bandwidth (which is likely to be the case), an operator must be tasked with limiting the number of cameras streaming at any given time to avoid saturating the data transfer link(s).

Despite FYOR, the first priority must be to be able to produce one video stream that is interesting to most participants and spectators. This is likely to be a broadcast type production with a classical narrative. The producer is the person tasked with determining what will be interesting to the audience. The commentator is tasked with providing commentary on this production.

We envision a voting system, where the producer as well as the spectators and participants who have a device, will have votes, but the producer will be assigned more votes than the rest of the audience. The software system will then open the stream from the highest voted cameras that are ready to stream. The cameras can be designed to themselves find out if they are ready to stream by using e.g. motion detection. The producer may also have the option to turn a certain camera on and override the motion detection and the voting system.

If the number of cameras with votes exceeds the safe system limit, and a camera that has more votes than a currently streaming camera becomes ready, the complexity rises because the system must decide if it should shut down an already streaming camera. As motion detection comes with some uncertainty, it makes sense to raise the confidence limit required before the new camera is allowed to stream, and also to introduce a delay, in case the event on the new camera is short-lived.
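The preemption rule described above can be sketched as follows. This is only an illustration under stated assumptions: the `CameraState` fields, the `strict_confidence` value of 0.9, and the three-second `hold_seconds` delay are hypothetical and would need tuning in a real deployment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CameraState:
    votes: float
    motion_confidence: float      # 0.0 to 1.0, as reported by the motion detector
    ready_since: Optional[float]  # time (s) when motion was first seen, None if idle


def should_preempt(candidate: CameraState, weakest_streaming: CameraState,
                   now: float, strict_confidence: float = 0.9,
                   hold_seconds: float = 3.0) -> bool:
    """Decide whether a newly ready camera may replace a streaming one."""
    if candidate.ready_since is None:
        return False  # nothing is happening in front of the candidate
    if candidate.votes <= weakest_streaming.votes:
        return False  # the candidate is not more popular
    if candidate.motion_confidence < strict_confidence:
        return False  # raised confidence limit for preemption
    # Delay: the event must have lasted long enough not to be short-lived.
    return now - candidate.ready_since >= hold_seconds
```

Only the streaming camera with the fewest votes needs to be compared against, since it is the one that would be shut down.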

Assignment of votes

The audience may assign votes manually by interacting with a map that shows the course as well as the positions of the cameras. We envision a GUI where the producer or operator can set a camera in an on, off, or auto state. In the on state, the camera will stream irrespective of votes or motion detection. In the off state, it will likewise not stream (this may be a situation where the operator needs to intervene for the sake of overall system health). If a camera is in the auto state, any member of the audience may assign votes by selecting the camera in the UI. The producer will have a high number of votes, while the votes of participants and spectators will be weighted inversely proportional to the number of cameras they select.
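The inverse proportionality can be illustrated with a small sketch. We assume here that each audience member has a fixed vote budget that is split evenly over the selected cameras; the function name and the budget values are hypothetical.

```python
from typing import Dict, List


def spread_votes(budget: float, selected: List[str]) -> Dict[str, float]:
    """Split a user's vote budget evenly over the distinct cameras they select."""
    cameras = list(dict.fromkeys(selected))  # drop duplicates, keep order
    if not cameras:
        return {}
    share = budget / len(cameras)
    return {camera: share for camera in cameras}


# A spectator with a budget of 6 who selects three cameras gives each 2 votes.
print(spread_votes(6.0, ["cam1", "cam2", "cam3"]))
# A producer would simply be given a much larger budget.
print(spread_votes(100.0, ["cam2"]))
```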

This creates an opportunity for computer-assisted assignment of votes. Typically, the GUI will let the audience select the runners they are interested in, and so assignment may happen based on how sure the system is that an interesting athlete is nearing a given camera. For the producer, interesting athletes are those fighting for the lead or gaining on it. For participants, it may be club mates or close competitors. If the athlete is tracked by GPS, the GPS position could be used, and if not, the likely arrival time can be extrapolated based on past performance.
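One conceivable way to turn a GPS position into an automatic vote is sketched below. It rests on strong simplifying assumptions that are ours and not part of the proposal: the athlete is assumed to approach the camera in a straight line at constant speed, and the vote weight is assumed to grow linearly as the estimated arrival time drops below a hypothetical 120-second horizon. For athletes without GPS, `eta_seconds` could instead be fed a remaining distance and a pace derived from past performance.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def eta_seconds(distance_m: float, speed_m_per_s: float) -> float:
    """Estimated time until the athlete reaches the camera."""
    if speed_m_per_s <= 0:
        return math.inf  # a stationary athlete never arrives
    return distance_m / speed_m_per_s


def automatic_vote(budget: float, eta_s: float, horizon_s: float = 120.0) -> float:
    """Vote weight rising linearly from 0 at the horizon to the full budget."""
    if eta_s >= horizon_s:
        return 0.0
    return budget * (1.0 - eta_s / horizon_s)


eta = eta_seconds(distance_m=300.0, speed_m_per_s=5.0)  # 60 s away
print(automatic_vote(budget=4.0, eta_s=eta))            # 2.0
```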

Protocols

Details of the video streaming protocol will have to be worked out later. What is important to note is that the most common protocol for this purpose, the Real Time Streaming Protocol (RTSP), requires a client to initiate streaming. It therefore makes sense that the camera just reports whether it is ready to play and whether something interesting may be recorded; it is up to the viewers to decide whether to actually stream, if they are authorized. The streaming itself is likely to happen through a proxy in the LAN.

This problem appears to lend itself very well to RESTful protocols. The term RESTful is frequently abused to mean any HTTP-based protocol where a resource has a URI, but that is only a small part of the point of REST. Critical to REST is that the messages contain everything needed to drive the interaction. Thus, a detailed protocol description is superfluous. Indeed, a detailed description is a sign that the REST constraints are not understood, as the protocol then relies on external documentation rather than on the messages themselves. Suffice it to say that we rely on hypermedia RDF.

Camera interaction

For the camera control protocol, each camera acts as a client in a client-server architecture and submits some RDF to a central server controlled by an operator at the venue. The subject URI tells the central server where to find the video stream if it needs it. The RDF also states whether the server is authorized to start playing, pausing, etc., and gives some information about the camera's readiness to stream, including information from the motion detector.

Based on this information, the server may then decide to instruct the camera to start streaming. The response to the RDF submission need not contain any content, but it should indicate to the camera how often it should send its updates. If the camera has very few votes, the server could instruct the camera to send updates infrequently, e.g. every 10 seconds, whereas if a camera has many votes and the server is just awaiting motion detection in the field of view, it should update more frequently, possibly as fast as possible.
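How the server might map votes to an update interval is sketched below. The 10-second upper bound comes from the text; the linear mapping and the one-second lower bound are assumptions made for illustration (an interval conveyed through HTTP date headers cannot be finer than one second).

```python
def update_interval_seconds(votes: float, max_votes: float,
                            slowest: float = 10.0, fastest: float = 1.0) -> float:
    """Interval between camera updates; more votes give a shorter interval."""
    if max_votes <= 0 or votes <= 0:
        return slowest
    fraction = min(votes / max_votes, 1.0)
    return slowest - fraction * (slowest - fastest)


print(update_interval_seconds(0.0, 10.0))   # 10.0: nobody is interested
print(update_interval_seconds(5.0, 10.0))   # 5.5
print(update_interval_seconds(10.0, 10.0))  # 1.0: the most popular camera
```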

HTTP Example

An HTTP implementation of the above is possible. Below is a simple example, which disregards complexities that may arise from the fact that each event should possibly carry an explicit timestamp, also identified by a URI.

PUT /cam HTTP/1.1
Host: venue.orienteering.org
Content-Type: text/turtle
Date: Fri, 24 Jan 2014 22:13:03 GMT

@prefix hm: <http://example.org/hypermedia#> .
@prefix disco: <http://rdf-vocabulary.ddialliance.org/discovery#> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix dcmit: <http://purl.org/dc/dcmitype/> .

<rtsp://camhost1.orienteering.org/stream> a dcmit:MovingImage ;
  geo:lat 59.97061 ;
  geo:long 10.64216 ;
  hm:can hm:play, hm:pause, hm:stop ;
  disco:standardDeviation 124.12 ;
  disco:mean 1050.21 .


HTTP/1.1 204 No Content
Date: Fri, 24 Jan 2014 22:13:04 GMT
Expires: Fri, 24 Jan 2014 22:13:06 GMT

The above example tells the server the camera's position and that the server may play, pause, and stop the remote video stream, having been authorized in a preceding request. Defining the semantics of these operations such that they can be acted upon is a topic for further research. The example assumes a simple motion detection algorithm, in which motion is inferred when the average changes by more than what is expected from the standard deviation. The response then tells the client that it may, or should, update the server with new information after two seconds.
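On the camera side, the update interval can be derived from the difference between the Expires and Date headers of the response. The following is a minimal sketch using only the Python standard library; the function name and the dictionary representation of the headers are illustrative assumptions.

```python
from email.utils import parsedate_to_datetime


def seconds_until_next_update(headers: dict) -> float:
    """Time the camera should wait before sending its next update."""
    date = parsedate_to_datetime(headers["Date"])
    expires = parsedate_to_datetime(headers["Expires"])
    # Using the server's own Date avoids depending on the camera's clock.
    return max((expires - date).total_seconds(), 0.0)


response_headers = {
    "Date": "Fri, 24 Jan 2014 22:13:04 GMT",
    "Expires": "Fri, 24 Jan 2014 22:13:06 GMT",
}
print(seconds_until_next_update(response_headers))  # 2.0
```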

User interaction

The audience and the operators need to interact with a system to set priorities for cameras and to watch video streams. For this, the user agents need to know the position of each camera and its status, i.e. whether it is set to on or off by an operator or producer, or to auto, where the audience may vote. They should also have the total number of votes for each camera, so that the audience may see if their votes may be better used elsewhere. Finally, hypermedia triples should be added for logged-in users, to tell the client how to cast votes. This can be done as in the following example:

HTTP Example

GET /cams/ HTTP/1.1
Host: venue.orienteering.org

HTTP/1.1 200 OK
Date: Fri, 24 Jan 2014 22:16:13 GMT
Content-Type: text/turtle

@prefix rev: <http://purl.org/stuff/rev#> .
@prefix hm: <http://example.org/hypermedia#> .
@prefix hma: <http://voting.orienteering.org/hypermedia-application-specific#> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix dcmit: <http://purl.org/dc/dcmitype/> .
@prefix ex: <http://example.org/camera-status#> . # placeholder namespace for the status terms

<rtsp://camhost1.orienteering.org/stream> a dcmit:MovingImage ;
  geo:lat 59.97061 ;
  geo:long 10.64216 ;
  ex:status ex:on .

<rtsp://camhost2.orienteering.org/stream> a dcmit:MovingImage ;
  geo:lat 59.9503 ;
  geo:long 10.7214 ;
  ex:status ex:auto ;
  hm:canBe hma:votedFor ;
  rev:hasReview </user/foo/vote/1>, </user/bar/vote/56> .

</user/foo/vote/1> rev:rating 1.2 ;
                   rev:reviewer </user/foo> ;
                   hm:canBe hm:deleted, hm:replaced .

</user/bar/vote/56> rev:rating 5.4 ;
                    rev:reviewer </user/bar> .

In the above example, it is assumed that </user/foo> is authenticated and authorized to edit their votes as well as to cast votes for a certain camera.

To vote, the client would first dereference hma:votedFor to obtain something like (omitting the HTTP dialog and prefixes for brevity):

hma:votedFor rdfs:comment "To vote for a resource, add a review with a rating to the camera."@en ;
             hm:httpMethod "PUT" ;
             hm:collection </user/foo/vote/> ;
             rdfs:seeAlso [
               rdfs:label "Review ontology" ;
               rdfs:isDefinedBy <http://purl.org/stuff/rev#>
             ] .

Then, the client should follow the directions to add the user's votes, e.g. with HTTP:

HTTP Example

PUT /user/foo/vote/2 HTTP/1.1
Host: venue.orienteering.org
Content-Type: text/turtle

@prefix rev: <http://purl.org/stuff/rev#> .

<rtsp://camhost1.orienteering.org/stream>
  rev:hasReview <> .

<> rev:rating 4 ;
   rev:reviewer </user/foo> .

HTTP/1.1 204 No Content

The relative URI <> will resolve to the Request-URI, i.e. http://venue.orienteering.org/user/foo/vote/2. Thus, this will add another vote to the first camera. For the user experience, the user agent should check that the user does not exceed their allowed number of votes, and incorporate that in the user interface. If malicious users are feared, this should also be verified on the server side. The total number of votes per user can be part of the user profile, not shown here.
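Both points can be illustrated with a short sketch: the resolution of the empty relative reference, and a client-side check of the vote allowance. The function names and the allowance values are hypothetical, since the user profile is not shown in this paper.

```python
from urllib.parse import urljoin


def resolve_empty_reference(request_uri: str) -> str:
    """An empty relative reference resolves to the base, here the Request-URI."""
    return urljoin(request_uri, "")


def within_allowance(existing_ratings, new_rating: float, allowance: float) -> bool:
    """True if casting new_rating keeps the user within their vote allowance."""
    return new_rating > 0 and sum(existing_ratings) + new_rating <= allowance


print(resolve_empty_reference("http://venue.orienteering.org/user/foo/vote/2"))
# The user foo has already cast a vote rated 1.2 and now wants to add 4.
print(within_allowance([1.2], 4, allowance=6.0))  # True
print(within_allowance([1.2], 4, allowance=5.0))  # False
```

The same check would be repeated on the server, where it cannot be bypassed by a modified client.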

Implementation

With the protocol established for communications, it is time to examine some details in the implementation on the server and in the clients.

Server

To find which cameras will be streamed, the server will run two SPARQL queries against its own database:


SELECT ?camera WHERE {
  ?camera ex:status ex:on .
} LIMIT $MAXCAMS

where $MAXCAMS is not part of the SPARQL language but a configuration parameter set by an operator to the maximum number of cameras that can be streamed simultaneously.

When this query has returned and the number of solutions (hereafter denoted $OPCAMS) is smaller than $MAXCAMS, the following query will be executed:

 
PREFIX rev: <http://purl.org/stuff/rev#>
SELECT ?camera WHERE {
  ?camera ex:status ex:auto ;
          disco:standardDeviation ?stddev ;
          disco:mean ?avg ;
          rev:hasReview ?vote .
  ?vote rev:rating ?rating .
  FILTER ( ?avg > $RUNNINGAVG + $THRESHOLD * ?stddev )
}
GROUP BY ?camera
HAVING (SUM(?rating) > 0)
ORDER BY DESC(SUM(?rating))
LIMIT $MAXCAMS - $OPCAMS

where $RUNNINGAVG is a variable kept by the application, holding the running average of observations made while a camera is not filming, and $THRESHOLD is a configuration variable that sets the sensitivity of the camera. (The queries have not yet been tested.)
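Since the queries are untested, it may help to state their intended combined effect in ordinary code. The sketch below is our reading of that intent and not the implementation: cameras are represented as plain dictionaries with hypothetical keys, and we assume that the highest-voted cameras should be preferred.

```python
def select_cameras(cameras, max_cams, running_avg, threshold):
    """Cameras forced on come first, then the top-voted auto cameras with motion."""
    forced = [c["uri"] for c in cameras if c["status"] == "on"][:max_cams]
    remaining = max_cams - len(forced)

    candidates = []
    for c in cameras:
        if c["status"] != "auto":
            continue
        total = sum(c["ratings"])
        # Same test as the FILTER: the mean exceeds the idle running average
        # by more than threshold standard deviations.
        moving = c["mean"] > running_avg + threshold * c["stddev"]
        if moving and total > 0:
            candidates.append((total, c["uri"]))

    candidates.sort(reverse=True)  # highest total rating first
    return forced + [uri for _, uri in candidates[:remaining]]


cams = [
    {"uri": "cam1", "status": "on", "ratings": [], "mean": 0.0, "stddev": 0.0},
    {"uri": "cam2", "status": "auto", "ratings": [1.2, 5.4],
     "mean": 1050.21, "stddev": 124.12},
    {"uri": "cam3", "status": "auto", "ratings": [2.0],
     "mean": 700.0, "stddev": 50.0},
]
print(select_cameras(cams, max_cams=2, running_avg=700.0, threshold=2.0))
```

With these illustrative values, cam1 is selected because it is forced on, cam2 because it has votes and shows motion, and cam3 is left out because its mean does not exceed the idle level.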

These queries will yield a list of the cameras that shall stream at any given time. The application will hold a list of playing cameras, so that cameras that are added to the list will receive a play request and cameras that fall out of the list will cause a stop request to be sent. Some extra work must be done to ensure that cameras do not switch too quickly in response to changing circumstances. A proxy will be set up so that the clients that are held by the audience can access the streams.
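The bookkeeping for play and stop requests amounts to two set differences. A minimal sketch follows, in which the function name is hypothetical and the actual sending of RTSP requests is left out:

```python
def plan_transitions(playing: set, wanted: set):
    """Return the cameras to send play requests to and those to send stop requests to."""
    to_play = sorted(wanted - playing)  # newly selected cameras
    to_stop = sorted(playing - wanted)  # cameras that fell out of the list
    return to_play, to_stop


to_play, to_stop = plan_transitions(playing={"cam1", "cam2"},
                                    wanted={"cam1", "cam3"})
print(to_play)  # ['cam3']
print(to_stop)  # ['cam2']
```

Sending the stop requests before the play requests would keep the link from being briefly oversubscribed during a switch.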

User Agents

The User Agents will run the gpsseuranta.net light client, which provides the map for the event and the GPS tracks. Onto this map, the cameras will be projected. This application also provides a list of GPS-equipped runners, with the possibility to select interesting runners. If this data is accessible to the camera layer of the system, this could be used to distribute votes automatically.

We also noted above that athletes without a GPS may also be interesting, and the implementation may also need to take this into account.

Open problems

The current application is intended for rapid deployment, and so cannot rely on open research problems being solved. Nevertheless, it does point out some open problems.

Reasoning on actions

It would be interesting if applications could be driven by a small number of standardized primitives, e.g. CRUD operations, and reasoning could be applied to find out what a certain high-level instruction, e.g. "vote", means. In the above example, there is sufficient in-band information for a programmer to implement the voting system (of course, in a real implementation, the inline documentation should be more verbose), but there is insufficient information for a machine to create the required functionality itself. Reasoning and bounded homomorphisms are worth looking into for this purpose.

Cascading updates

As can be seen in the above example, deleting all triples with the subject </user/foo/vote/1> will not delete the triples in which cameras link to it. This is not a problem for the application in this scenario, since it counts the votes, but it is a possible problem in other application scenarios, and there should be a general solution, possibly cascading updates through the graph.