This week i’m working on a new book on ‘mobile learning’. I’m sharing some sections as they are completed. Here, a section on augmented reality, exploring what it is and where it may take us. The final section that i am working on today is on potential application at a practical level.
When i walk out of the station, i see a sign for ‘Town Centre’. It’s perfectly placed, directing me immediately to turn right. Some time later, i am still walking down a road, but it’s becoming progressively more suburban. The shops have given way to houses, the pavements to tree-lined avenues. Wherever i am, it’s clearly nice, but clearly not the town centre as promised.
The reality is that i am lost. What looked like a promising start has turned into a suburban stroll. I’m not worried because my phone can help me get back: activate the map feature and look for the town centre. If i double click, the compass is activated, showing me clearly which way to head. Simple and effective.
In this context, i have my eyes to look for road signs or clues as to where i am, and i have the map to help me navigate. Once i have found the centre, i can drop a pin and get it to call out directions, like a sat nav.
Augmented reality takes us one step further. Whilst on my stroll through town, i am relying on two disjointed views of reality, a map and my eyes; augmented reality joins the two together. Typically this is done by providing an overlay to a video feed. In this context, i hold the phone up, the camera activates and overlays labels on the video. From where i am standing i can only see trees, houses and distant hills. The overlays on the video can augment this. They can show the street name, painted over the tarmac. They can show me where the station is, a sign hovering in mid-air at eye level. As i move the phone round, the signs stay fixed to their places in the world, panning across the screen. If i want to find my way back to the station, i just have to keep it in the field of view in front of me and walk that way.
Augmented reality lets you see through houses, not literally, but it can label what sits on the other side. It can name the distant hills. Things that are close can be shown large, things further away smaller and, if i desire, i can click on one of those things and get directions.
So augmented reality is what we can see, but with an overlay of information. Street signs and directions are simply the start of what is possible.
Take the Starwalk App for the iPad. The screen shows an overlay of the sky with names of stars and planets on it. As i move the iPad across my field of vision, the view pans and shows the correct names for what is behind it. Suddenly i am an expert. As a child, i spent years with my red Tasco telescope and a star chart, trying to identify which one the North Star is, which constellation looks like a bear, what might be a planet and where the International Space Station lives. All of this information, once so elusive, once so uncertain, is now clear for me.
This application of augmented reality is empowering. It’s not just that it lets me know what the stars are, it’s that it lets me look at them with my niece or nephew and be able to share the stories. Our interaction is no longer about the uncertainty of whether that’s a star or a planet that we are viewing, but rather about being able to know with certainty and to share a whole range of associated information, because the other thing we can start to do with the technology is add metadata to the view.
Instead of just seeing the constellation, i can access relevant information about how far away it is, the types of stars and galaxies that i can see, but also the way it was understood in Greek mythology and how that differed from Roman views. I can see what it looks like from the Space Station, or through the Hubble telescope. My range of experience is broadened exponentially, all facilitated by the technology, and it requires me to do nothing more than to point the device at the sky.
The technology is still embryonic, but already the potential is clear. Navigation is an obvious application: travel to a new city and just point the camera to find where to go, but then point the camera at the Statue of Liberty and let the phone tell you what it’s called and when it was built, and show you some footage of it being constructed. We are used to devices being passive portals to information, interfaces to be interrogated, to be ‘searched’. We are not yet used to the proactive pushing of information in quite this way.
Increasingly we will see layers of information available, overlaid on the video feeds. People, places, history and stories: there is no limit to what can be called up, and once we link this with geolocation services, where the device knows where it is as well as being able to add layers of metadata to the view, we have a truly potent experience that can be applied to all sorts of learning situations.
One of the most important innovations that will impact on the development of augmented reality services is object recognition. We already have commercial applications of facial recognition: on the iPhone and on Facebook, these services can identify you, literally pick you out of a crowd. What we don’t see so much of is object recognition. Starwalk isn’t literally looking at the sky and identifying stars, but rather it’s using the tilt sensors and compass to know where i am pointing it. It’s not analysing the video feed, but rather it’s overlaying information from memory, choosing what is relevant from where it thinks i am pointing it.
Similarly, the view of the street showing where the station sits, projecting a hovering sign at eye level, isn’t analysing what it can see; it’s just projecting onto the video based on where i am pointing the phone. It’s clever, but not using vision and recognition.
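To make that concrete, here is a minimal sketch in Python of how a sensor-driven overlay might decide where to draw a label, without ever looking at the video. It is an illustration under stated assumptions, not how any particular app works: the function names, the 60 degree field of view, the 1080 pixel screen width and the scaling rule are all hypothetical. It takes the device's position and compass heading, works out the compass bearing to a known landmark, and turns the difference into a horizontal screen position and a label size.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in degrees (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360


def label_x(heading, target_bearing, fov=60.0, screen_width=1080):
    """Horizontal pixel position for a label, or None if it is outside the field of view.

    fov and screen_width are assumed values for illustration only.
    """
    # Signed angle between where the phone points and where the landmark is (-180..180).
    offset = (target_bearing - heading + 180) % 360 - 180
    if abs(offset) > fov / 2:
        return None
    # Map -fov/2..+fov/2 linearly onto 0..screen_width.
    return round((offset / fov + 0.5) * screen_width)


def label_scale(distance_m, full_size_within_m=100.0, min_scale=0.3):
    """Shrink labels for distant things; the thresholds here are arbitrary assumptions."""
    if distance_m <= full_size_within_m:
        return 1.0
    return max(min_scale, full_size_within_m / distance_m)
```

As the phone turns, only `heading` changes, so the label slides across the screen while staying pinned to the same compass direction. A landmark behind the viewer simply returns `None` and is not drawn.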
It’s very hard for a processor to know what a chair is. Why? Because chairs all look a little different, they look different from different angles and, worst of all, they generally look nothing like the drawing that a child makes of a chair. I mean, the general quality of ‘chairness’ is there. Four legs, a seat and a back, but do the legs have struts halfway up? Do they have a curved runner at the bottom, joining front and back? Do they have a solid or slatted back? And worse, sledges have runners too, so how do you know it’s a chair and not a sledge?
Things that are easy for us to identify are very hard for computers to work out. Like the difference between a side table and a stool. They are both small, with a flat top and four legs. And, even worse, you can’t just say it’s a stool if someone sits on it, because you can sit on a coffee table too. In fact, if you want to be pedantic, a coffee table can be both a table and a stool at one and the same time.
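The stool and side table problem can be shown with a deliberately naive sketch in Python. The features and rules below are hypothetical, chosen only to mirror the description above, and real recognition systems do not work from a hand-written checklist like this. The point is simply that two different objects can present exactly the same measurable features, so any rule that looks only at those features must give them the same label.

```python
from dataclasses import dataclass


@dataclass
class Furniture:
    # Hypothetical features that a simple rule-based system might measure.
    legs: int
    flat_top: bool
    has_back: bool
    height_cm: float


def naive_label(item):
    """Label an object from a hand-written checklist of features."""
    if item.legs == 4 and item.has_back:
        return "chair"
    if item.legs == 4 and item.flat_top and item.height_cm < 60:
        # Could equally be a side table: these features cannot tell them apart.
        return "stool"
    return "unknown"


stool = Furniture(legs=4, flat_top=True, has_back=False, height_cm=45)
side_table = Furniture(legs=4, flat_top=True, has_back=False, height_cm=45)

# Both objects get the same label, because their features are identical.
print(naive_label(stool), naive_label(side_table))
```

Adding more rules does not solve this; it only moves the ambiguity somewhere else, which is why the question of use (is someone sitting on it?) does not settle the matter either.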