Wed, Aug 19, 2009
Today, I was very excited when Thomas Wrobel sent me a draft of, “Everything Everywhere: A proposal for an Augmented Reality Network system based on existing protocols and infrastructure.”
Thomas has kindly agreed to let me publish his draft, to open a discussion on this topic. The diagram opening this post (click image to enlarge) shows, “An example of how collaborative 3D-spaces could be shared over existing IRC networks.” It is from Thomas’ proposal. The full text of his paper is included later in this post.
“Can we try to avoid a browser war this time?”
Thomas notes in the closing remark to his paper:
“I am absolutely confident in my belief AR will become at least as important as the web has, and probably a lot more so. It will also face much the same hurdles and challenges getting established as that medium did. But, speaking as a web-developer, can we try to avoid a browser war this time?”
Thomas Wrobel has consistently posted insightful comments on how existing standards could be used for creating open augmented reality networks. But he expressed concern to me that his work and this paper not be overplayed:
“I’m hardly a leader, I’m just an amateur with a load of ideas on AR-related topics, some which might be useful, others might become unworkable. I don’t want anyone to get the impression this is how I think it has to, or should be done.”
I have been bringing up this topic of using existing standards and infrastructure where possible for open augmented reality networks in all my interviews with members of the AR Consortium.
And I am finding agreement on a point that Robert Rice makes, “there is no perfect, ultimate solution *now*, but we have to do *something* to work from and refine/evolve.”
Thomas Wrobel makes what I consider some crucial opening suggestions. I take my hat off to him for thinking about this early, coming up with some clear, elegant, and practical ideas, and doing the work to articulate these ideas so others can participate in evolving them. Massive props for that, many times over.
Good ideas on standards at an early stage of a developing industry like augmented reality are like spring sunshine and April showers for new crops. No one knows what storms and pests the growing season will bring – but water and sunshine (open standards) are always a good start. And, personally, I can’t wait to see how this new industry unfolds (see Bruce Sterling’s awesome Layar Conference keynote: “At the Dawn of the Augmented Reality Industry.”)
Thomas Wrobel is:
“a web developer working for a small, brand-new company called Lost Again, which mostly works on ARGs (That is, the alternate reality games, not the augmented reality games, although there’s probably going to be big overlap there in the future). We developed two educational ARG games for the Netherlands with a company called res-nova.”
I have been following Alternate Reality Games through the amazing work of Elan Lee and Fourth Wall Studios. Like Thomas, I think the intersection of ARGs and augmented realities is going to be very interesting. Thomas wanted me to point out that the website for his company with Bertine van Hövell, http://www.lostagain.nl/, is just a placeholder for now.
“Probably be up fully within a week or two.” And, “despite the logo, we aren’t an AR company [yet], or a travel firm. The logo’s supposed to represent being lost in our minds.”
Thomas has been thinking about the topic of an open augmented reality network for a while now. He is an artist, also known as DarkFlame, and his ARN network is included in this augmented reality concept for 2086 that he did in 2006 (click on image below to enlarge).
Both Thomas and Mez Breeze made extensive and insightful comments on my last post, “Augmented Reality – Bigger Than the Web: Second Interview with Robert Rice.” And in particular they both picked up on something I am very interested in – the potential use of the Google Wave Web of protocols in creating open augmented reality networks.
Mez in her brilliant brainsplosion on social tesseracting takes on the very definition of information:
“Tish, when you ask Robert “…what is your approach to delivering a massively shared real time [augmented reality] experience that is like Wave not confined to a walled garden?” that’s an extremely relevant question + one that needs to be addressed while considering the entirety of the Reality-Virtual Continuum. I’ve recently finished a series of articles addressing this: the framework I’ve developed is termed “Social Tesseracting.”
I have recently begun exploring the Google Wave Web of Protocols, which are nicely outlined in this post by J Aaron Farr, which also includes the very interesting diagram below (more on Google Wave in another post).
But, as Thomas notes, while he demonstrates his ideas using IRC (Internet Relay Chat) they reach Beyond IRC:
“As mentioned before IRC has some drawbacks, which are due to its age or method of working. As such, future systems might yet prove better alternatives for a open AR network. One example of such a system is Google Wave. It shares many of the advantages of IRC (open, anyone can create a channel of data, different permission levels can be set and its free), while avoiding some critical restrictions. (The data can be persistent). I believe some of the ideas I’ve mentioned, and possibly even the proposed protocol string could be adapted for Google Wave or other future systems. I believe overall the principles are more important then any specific implementation to get to them”
Also, Thomas pointed out that while he uses markers to illustrate some of his examples, they are just one method for tracking. What he is presenting is transparent to the methodology of registration/tracking.
Tish Shute: You mostly use marker-based examples, but there is no reason why the principles you are suggesting will not be just as relevant as we move into using more sophisticated image recognition tools, is there?
Thomas Wrobel: No reason whatsoever. I mostly chose familiar markers as something that could be used now, with a lot of coding libraries already established for them. I think for most future AR use, markers will go completely…especially outside. Either things will be done purely by GPS or object recognition, or (in the case of advertising) the markers will look like normal posters.
However, I do think traditional markers might “cling on” as being used for non-geographically-specific stuff at home. After all, if you need some reference points for moving meshes about in real time…(say, when playing a board game with a friend on the other side of the world)…then there’s probably nothing that’s going to be more practical than some simple bits of paper or card.
– A proposal for an Augmented Reality Network system based on existing protocols and infrastructure.
by Thomas Wrobel
The following paper is my vision of an open AR Network and potential methods to implement it with existing technologies. Specifically I’ll be focusing on the potential for a global outdoor AR network, although the ideas aren’t limited to that.
Of course I call it “my” vision, but I’m obviously not the first to have many of these ideas. I have been influenced and inspired by many things…
[Some of Thomas Wrobel’s influences – watched and played. Images from Mitsuo Iso’s Denno Coil (click to enlarge) top; below, from the game “Metroid Prime,” and Terminator; and the last from Buffy the Vampire Slayer!]
The AR Network.
When I speak of a future AR Network, I mean one as universal and as standard as the internet. One where people can connect from any number of devices, and without additional downloads, experience the majority of the content.
Where people can just point their phone, webcam, or pair of AR glasses anywhere a virtual object should be, and they will see it. The user experience is seamless; AR comes to them without them needing to “prepare” their device for it.
From this point forward, I will refer to this future AR Network simply as the “Arn”.
The Arn should be an inclusive and open platform that any number of devices can connect to, and on which anyone can make and host their own location-specific models or data.
It should allow people to communicate both publicly and privately, and not have their vision constantly cluttered with things they don’t want to see.
There are two old, existing paradigms that I think can help reach this goal when they are combined.
The Internet Relay Photoshop.
IRC, or Internet Relay Chat, is a chat system designed by Jarkko Oikarinen in the late 80s.
It’s a system where people meet on “channels”; they can talk in groups, or privately. Channels can be read-only, or open for all to contribute to. There is no restriction on the number of people that can participate in a given discussion, or the number of channels that can be formed. All servers are interconnected and pass messages from user to user over the network.
To me, this relatively old internet technology is a great template, or even foundation, for how the Arn could operate. Rather than text being exchanged, it would be mesh data (or links to mesh data), but other than that many of the same principles could apply.
People could join channels of information to view or contribute to. Families could leave messages to each other scribbled in mid-air on private channels. Strangers could watch AR games being played between people in parks. People going into a restaurant could see the comments from recent guests hovering by the menu items.
None of this would have to be called up specially; if they are on the right channel when it is broadcast, they will see it.
The IRC paradigm becomes particularly powerful when combined with another one common to many computer users: that of a “Layer” in an art program, such as Photoshop or Paint Shop Pro.
As most of us know, layers allow us to separate out different components of a piece of art while editing, either to focus our attention on one piece, or to make future editing easier.
Now what if we simply have each “channel” of information represented as a layer?
Click to enlarge image above.
Having channels corresponding to layers is an easy and intuitive way for the Arn to operate. The user can log in and contribute data to any channel, like IRC, as well as adjust the desired opacity and visual range of each layer, like they would a layer in Photoshop.
In this way they can get a custom view of the world, both with shared and personal AR elements visible at the same time.
They would not have to switch between various overlays to their world view, as they could see many at the same time.
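A minimal sketch of this channels-as-layers state, assuming each subscribed channel carries a user-set opacity and visual range (all names and numbers here are illustrative, not part of the proposal):

```python
from dataclasses import dataclass

@dataclass
class Layer:
    """One subscribed channel, rendered as a layer (like in Photoshop)."""
    channel: str      # hypothetical channel name, e.g. "#city-maps"
    opacity: float    # 0.0 = invisible, 1.0 = fully opaque
    max_range: float  # metres; meshes further away than this are not drawn

def visible_layers(layers):
    """Return the layers the browser should composite, skipping muted ones."""
    return [l for l in layers if l.opacity > 0.0]

# A user's custom view of the world: shared and personal layers at once.
view = [
    Layer("#city-maps", opacity=0.8, max_range=500.0),
    Layer("#family-notes", opacity=1.0, max_range=50.0),
    Layer("#adverts", opacity=0.0, max_range=100.0),  # muted by the user
]
print([l.channel for l in visible_layers(view)])  # ['#city-maps', '#family-notes']
```

The point of the sketch is only that muting or blending a channel is a per-user rendering setting, not something the channel itself controls.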
Persistence of Data
With an IRC or IRC-like system to communicate, the data sent is mostly temporary…broadcast on the fly from user to user and device to device, retained in the users’ local logs, but not “hosted” anywhere.
I think for the majority of day-to-day purposes this is not so much a drawback as actually desirable for AR. Most casual communication doesn’t need to be recorded permanently in 3D space and, indeed, if it was, the cost of running such a service would increase exponentially with users and with time. Not to mention, our visual view of the world would get very cluttered very quickly. Imagine what your monitor would be like if it kept a history of every window you have ever opened and their positions!
So for most cases AR space should be treated like a 3D monitor: letting us display many pieces of data from remote and local sources, and even share them with others, but not being, by default, a permanent record of it all.
Most data will be analogous to pixels on a display, and if kept in records, it’s only on the clients’ devices, not on the network itself.
However, occasionally we do want 3D data analogous to a web page, such as (in the example above) the map layer. Data here should be persistent and visible to all that have that layer turned on. I see no reason why hosting this data needs to use anything other than standard web hosting, with the (read-only) #channel on the Arn merely providing a route to the data.
As the user logs onto the channel, the server, using a chat-bot, can send them a list of meshes with location data attached, and the Arn browser can simply pick the data to display that’s local to them. (Note 1: By doing it this way around, it allows some degree of anonymity to be possible, rather than the server knowing exactly where you are and feeding the specific correct string to you.)
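A sketch of that client-side filtering, assuming the chat-bot’s listing gives each mesh a URL and GPS co-ordinates (the field names and the rough distance maths are illustrative only):

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Rough equirectangular distance in metres; adequate for short ranges."""
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    return 6371000 * math.hypot(dlat, dlon)

def local_meshes(mesh_list, my_lat, my_lon, radius_m=1000):
    """Client-side filter: the server broadcasts the whole listing and the
    browser keeps only nearby meshes, so the user's position never leaves
    the device."""
    return [m for m in mesh_list
            if distance_m(m["lat"], m["lon"], my_lat, my_lon) <= radius_m]

channel_listing = [
    {"url": "http://example.org/statue.kml", "lat": 52.37, "lon": 4.89},
    {"url": "http://example.org/mural.kml",  "lat": 48.85, "lon": 2.35},
]
nearby = local_meshes(channel_listing, 52.37, 4.89)  # keeps only the statue
```

The design choice being illustrated: filtering happens after the broadcast, on the client, which trades some bandwidth for location privacy.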
We simply need to establish standards so this data can be pulled up and interpreted.
For instance, this standard could be as simple as an XML string pointing to a KML file on a server. This could then be displayed in the user’s field of view at the co-ordinates specified.
In this way permanent data tied to locations, such as historical overlays or maps, could co-exist on the same protocol as temporary data such as mid-air chats or gaming-related meshes.
There is also no reason why this system of shared and personal spaces based on channels of data has to be restricted to things given absolute co-ordinates.
(Different ways to access the same mesh)
It could work just as well with Markers and thus relative co-ordinates.
This would be mostly useful for indoor use, letting people logged onto a channel see the same meshes as everyone else on the markers, thus allowing multi-player AR games, or AR games with observers, very easily.
For example, games like Chess could be played between people with no additional code needed; you simply have a set of markers for only your own pieces, and as you move them the channel updates with the new positions, which are displayed in place in your opponent’s field of view.
This sort of game comes “free” with just having a generic system of shared space supporting markers.
It would also allow AR adverts down the street or in magazines to be viewed by simply logging onto the right AR channel.
If markers are designed with URL data in them, this could even be a prompted or automatic process.
“There is visual data in this area on the following channel; #ABCD would you like to view this channel?”
Pros and Cons of using IRC or IRC-like systems
Pros:
• Anyone can write IRC interface software.
• Anyone can create new IRC channels without cost.
• Channels can have read and write permissions set.
• Users can easily have multiple channels open at once.
• Already established, with thousands of servers worldwide.
Cons:
• 500-or-so character limit; 3D data must be linked to, not sent.
• Slow update rate: lines of data can take a whole second or more to send.
• Non-persistent: good for a 3D view, not good for storage.
An example of how collaborative 3D-spaces could be shared over existing IRC networks;
Click on the image to enlarge
While in the long run I would hope for a dedicated AR network to be developed, with greater flexibility in the persistence of data, there is a lot that can be done with the existing IRC system to implement the ideas mentioned above.
Below I will show an example of a simple, crude pseudo-protocol that could be fairly easily implemented to create shared AR spaces broadcast across IRC channels.
It’s important to note that the goal here isn’t to exchange the mesh data itself on IRC; it’s to exchange links to the data.
Exchanging the mesh data directly within the 500-character IRC limit would be very hard, and liable to errors.
It’s also a waste of network bandwidth, as many people logged onto the channel might not have that object in their field of view, so their clients should not bother downloading it. (it should be up to the client browsers when to anticipate and cache mesh data).
Proposed Basic XML link exchange for AR;
As a user creates or changes an object, the client’s software posts a simple XML-formatted string to the IRC channel.
Anyone logged into that channel then sees that mesh displayed in the specified location.
This string could be formatted as follows;
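The draft illustrates the string itself with an image not reproduced here. Purely as a hypothetical sketch, a shape consistent with the fields described in the surrounding text (an ID, a mesh URL, a position, permissions, and a time-stamp; every tag and attribute name below is a guess, not taken from the draft), parsed with Python’s standard library:

```python
import xml.etree.ElementTree as ET

# Hypothetical Arn update string posted to the IRC channel; short enough
# to fit well inside the ~500-character line limit.
post = (
    '<object id="DarkFlame:42" '
    'mesh="http://example.org/meshes/pawn.dxf" '
    'perms="owner" updated="2009-08-19T12:00:00Z">'
    '<pos type="gps" lat="52.370" lon="4.890" alt="1.0"/>'
    '</object>'
)

# A receiving client parses the line and decides what to fetch and draw.
obj = ET.fromstring(post)
pos = obj.find("pos")
print(obj.get("id"), pos.get("type"))  # DarkFlame:42 gps
```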
This string allows the clients of other users logged into the channel to automatically load the object from the URL and display it at the correct position in their field of view.
If the permissions are set to allow it, they could then move the object themselves, with the update being fed back seamlessly to other users on the channel.
The objects posted are given an ID, which can be just the poster’s name, followed by a unique object number for that name. These unique IDs would allow clients to track different instances of the same mesh, as well as making it easy to implement permissions. (If only the poster should be allowed to move this object, then the clients simply check whether the ID matches the user name posting the update. If it doesn’t, they can ignore it.)
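That ownership check is simple enough to sketch. Assuming the hypothetical ID form "name:number" used above, a client-side filter might look like:

```python
def owner_of(object_id):
    """IDs take the form '<poster name>:<object number>', so the owner
    is just the prefix before the colon."""
    return object_id.split(":", 1)[0]

def accept_update(object_id, sender, perms="owner"):
    """Clients silently drop updates from anyone but the owner, unless
    the object was posted with open permissions."""
    return perms == "anyone" or sender == owner_of(object_id)

accept_update("DarkFlame:42", "DarkFlame")  # accepted: owner moving their own piece
accept_update("DarkFlame:42", "Mallory")    # ignored by everyone's client
```

Note the enforcement is entirely client-side, matching the draft’s point that a cheater would only be fooling their own system.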
Next the objects need to be linked to a mesh.
The location of the object’s mesh doesn’t have to be a fixed, remotely-hosted URL; it could be an IP address and port number of the user posting the mesh, hosted by the application posting the link to the channel.
The object’s co-ordinates, likewise, need not be specified as absolute GPS co-ordinates, but instead could refer to a generic marker.
Or relative to a marker;
Or relative to a default plane;
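The marker-relative and default-plane strings are likewise shown as images in the draft. As a hypothetical sketch only (all names invented for illustration), the three addressing modes could be resolved by the browser along these lines:

```python
# Three hypothetical ways a position could anchor a mesh.
POSITIONS = {
    "gps":     {"lat": 52.370, "lon": 4.890, "alt": 1.0},           # absolute
    "marker":  {"name": "Marker1", "x": 0.0, "y": 0.1, "z": 0.0},   # relative to a printed marker
    "default": {"x": 0.2, "y": 0.0, "z": 0.3},                      # relative to the user's Default plane
}

def resolve(pos_type, pos, marker_poses, default_plane):
    """Turn any of the three position modes into browser-space co-ordinates.
    marker_poses maps marker names to their tracked origin; default_plane
    is the origin of the user-defined play area (e.g. a square on a desk)."""
    if pos_type == "gps":
        return ("world", pos["lat"], pos["lon"], pos["alt"])
    if pos_type == "marker":
        ox, oy, oz = marker_poses[pos["name"]]
        return ("local", ox + pos["x"], oy + pos["y"], oz + pos["z"])
    if pos_type == "default":
        ox, oy, oz = default_plane
        return ("local", ox + pos["x"], oy + pos["y"], oz + pos["z"])
    raise ValueError(pos_type)
```

The same channel protocol carries all three; only the interpretation step on the client differs.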
The AR browsers could then handle the association between the marker’s pattern and its name.
This way the markers are reusable; users do not need unique markers to be printed for every new bit of AR they want to look at.
Users could just keep a set of generic markers handy, which they could simply assign to be Marker1, Marker2, etc., for any AR use. (Note 2: As mentioned above, specific markers could also contain a default ID name and channel built into their data, letting the Arn browser simply prompt the user if they want to see the model even if they aren’t in the right channel. This set-up would be most useful for paper and even billboard advertising.)
The Default location could be a settable region, or marker, in the client’s browser that defines a playable/usable area in the field of view. Mostly useful for home use, this could typically be a square region on a user’s desk.
So, in the chess-game example, the client of the person making the moves simply updates the position relative to the Default every time they move their marker (which is tied to a chess piece mesh).
Then the non-owners’ client software could automatically display it relative to their Default plane. This would make games like Chess, Checkers, Go, or any other game involving merely moving objects about very intuitive and easy to set up.
So having meshes settable to absolute GPS, marker-relative, or default-relative locations reduces the bother necessary to experience AR content quite considerably, and makes “non-geo-specific” AR applications and games trivial to implement.
Next is permissions.
Mesh-permissions would be a simple string saying who else can update the data, if anyone.
By default you could only update or move your own meshes (identified by the ID of the first posting). If you attempted to update anyone else’s, their clients would just ignore it.
Thus in a game of chess, you can only move your own pieces. If you attempted to move your opponent’s (by reassigning your own marker to their pieces’ IDs), the clients would just ignore that assignment. You’d only be fooling your own system.
Likewise, when pinning a message in mid-air for your friends to read, no one else can change that message without your permission, although copying it would be easy. (Note 3: It’s important to note this sort of object-specific permission system is in addition to the global-permissions, or “user-modes” it’s possible to set for the IRC channels and users as a whole.)
Finally, as object data could change on all sorts of time-scales, the easiest way to keep everyone logged in up to date is just to have a time-stamp of when each model was last updated.
This would not necessarily be the same as the XML string’s post date, because the model’s mesh might not be updated, but merely moved, in which case the Arn browser shouldn’t redownload the mesh.
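A minimal sketch of that cache rule, assuming the client keys its mesh cache by URL and remembers the last-seen update time-stamp (names illustrative):

```python
def needs_redownload(cache, mesh_url, updated):
    """Re-fetch only when the mesh itself changed; a move alone reuses
    the cached geometry at the new position."""
    return cache.get(mesh_url) != updated

cache = {}
url = "http://example.org/meshes/pawn.dxf"

# First sighting: mesh is unknown, so fetch it and remember its time-stamp.
if needs_redownload(cache, url, "2009-08-19T12:00:00Z"):
    cache[url] = "2009-08-19T12:00:00Z"

# Later the object is moved but the mesh is unchanged: no re-fetch needed.
needs_redownload(cache, url, "2009-08-19T12:00:00Z")  # False
```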
This sort of arrangement could be used as a standard today, and users wouldn’t have to constantly download special AR programs to view a single AR mesh.
In the long term I would hope for more advanced methods to manipulate Arn content online, analogous to DOM manipulation in web pages. But for now, we should at least establish standard methods for devices to pull up meshes and overlay them in the correct position.
So, having a layered system could give the user a seamless blend of dynamic and static data with which to paint their world.
I believe this is all relatively easy to achieve using modifications of existing web technology, combined with some basic graphics systems.
However, so far I have only talked about remote data.
What of programs originating on the device itself? This is, after all, how most AR software we have at the moment works.
I think that, just like the remote channels, local software should also be blended into the same list of layers. People shouldn’t have to “Alt+Tab” out of one view of the world to see another.
They should be able to see both at once, if they wish.
For instance, if you’re playing an AR game, why shouldn’t your chat window be viewable at the same time?
If you have skinned your environment with a custom view of the world, why shouldn’t you also see mapping or restaurant recommendations?
So local data and remote data should be blended in the same view.
How can AR software – of which I hope, there will be thousands – seamlessly be expected to layer their graphics, not only with the real world, but with each other, and with online data too? Will games and software makers need to co-operate to allow their graphics to be integrated together with correct occlusion taken into account? A tall order, no?
I must confess though, my technology knowledge fails me here.
I can only guess that special graphics drivers, or 3D APIs, will have to be developed to let programs share their 3D world with that of an Arn browser.
Maybe programs should simply treat themselves as a local server which the browser can connect to, and let the Arn browser handle all the rendering itself (although I imagine many games designers would find this quite limiting).
So I leave it as an exercise to the readers to discuss and propose the best methods by which this vision of a layered world could be realised.
As mentioned before IRC has some drawbacks, which are due to its age or method of working.
As such, future systems might yet prove better alternatives for an open AR network.
One example of such a system is Google Wave.
It shares many of the advantages of IRC (open, anyone can create a channel of data, different permission levels can be set, and it’s free), while avoiding some critical restrictions (the data can be persistent).
I believe some of the ideas I’ve mentioned, and possibly even the proposed protocol string could be adapted for Google Wave or other future systems.
I believe overall the principles are more important than any specific implementation used to get to them.
⁃ In order for AR to flourish the user shouldn’t need to download a separate application for each mesh they want to see.
⁃ Having URLs embedded in QR-coded markers which point to standard mesh files like DXF or KML would be a way to do this right now. The QR code would only have to be seen precisely in shot once; then its borders could be used like a standard marker.
⁃ An augmented view of the world needs to support visual multitasking, and having layers of information is the best way to do that.
⁃ Methods need to be devised to allow drastically different software to contribute to these layers, without restricting either the software’s rendering abilities or the user’s ability to pick and choose what layers of information they want to see.
I am absolutely confident in my belief AR will become at least as important as the web has, and probably a lot more so. It will also face much the same hurdles and challenges getting established as that medium did.
But, speaking as a web-developer, can we try to avoid a browser war this time?
Everything Everywhere, draft.
by Thomas Wrobel
Darkflame a t gmail