
Augmented Reality: Is this time different?


Ivan Sutherland’s Sword of Damocles, a head-mounted virtual and augmented reality system, was ungainly but remarkably forward-thinking. Developed over half a century ago, the system demonstrated in the video below includes many of the components that we recognize today as critical to VR and AR displays: graphics displayed via a headset, a positioning system, and an external computational mechanism.

Since then, AR and VR have experienced waves of hype that build over a few years but reliably fade into disappointment. With the current excitement over consumer-level AR libraries (such as ARKit and ARCore), it is worth asking whether anything is different this time.

The Augmented Connected Enterprise (ACE) team at FXPAL is betting that it is. We are currently building an AR-based remote assistance framework that combines several of our augmented reality, knowledge capture, and teleconferencing technologies. A future post will describe the engineering details of our work. Here we explore some of the problems that AR has faced in the past, and how we plan to address them.

In their paper “Drivers and Bottlenecks in the Adoption of Augmented Reality Applications” [1], Martínez et al. explored some typical pitfalls for AR technology: no standard and little flexibility, limited (mobile device) computational power, (localization) inaccuracy, social acceptance, and amount of information (distraction). We address each of these in turn below:

  • No standard and little flexibility
  • Limited (mobile device) computational power

Advances in contemporary technology have largely addressed these two issues. As mentioned above, the market appears to be coalescing around a few widely adopted libraries (specifically ARKit, ARCore, and Unity), and limited computational power on mobile devices is a rapidly receding concern.

  • (Localization) inaccuracy

Caudell and Mizell echoed this issue in the paper that introduced the term “augmented reality” [2]. They wrote that “position sensing technology is the ultimate limitation of AR, controlling the range and accuracy of possible applications.”

Addressing this concern involves scanning several real-world objects in order to detect and track them in an AR scene. Our experience so far suggests that, even if they aren’t yet ready for wide deployment, detection and tracking technologies have come a long way. The video below shows our procedure for scanning a 3D object with ARKit (adapted from this approach). We have found that a flat background is paramount to generating an object signature free of noisy background feature points. Other than that, the process is straightforward.
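
For the curious, here is a minimal sketch of what that scanning flow can look like with ARKit’s object-scanning API. The bounding box values, output path, and class structure are illustrative assumptions, with error handling pared down; this shows the approach rather than our actual app:

```swift
import ARKit

// A rough sketch of ARKit's object-scanning flow (iOS 12+). The bounding
// box and the output path are placeholders, not our production values.
final class ObjectScanner {
    let session = ARSession()

    func startScanning() {
        // ARObjectScanningConfiguration accumulates the dense feature
        // points needed to build an ARReferenceObject. A flat, uncluttered
        // background keeps stray feature points out of the signature.
        let configuration = ARObjectScanningConfiguration()
        session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
    }

    // Call once the object has been viewed from all sides. The box defined
    // by transform/center/extent should tightly bound the physical object.
    func finishScan(center: simd_float3, extent: simd_float3) {
        session.createReferenceObject(
            transform: matrix_identity_float4x4,
            center: center,
            extent: extent
        ) { referenceObject, error in
            guard let object = referenceObject else {
                print("Scan failed: \(String(describing: error))")
                return
            }
            // Persist the scanned signature as an .arobject file that can
            // be bundled with the app for later detection.
            let url = FileManager.default.temporaryDirectory
                .appendingPathComponent("scan.arobject")
            try? object.export(to: url, previewImage: nil)
        }
    }
}
```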

Scanning an object in this way generates a digital signature that our app can recognize quickly and accurately, allowing us to augment the physical object with interactive guides.
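
Once exported, those .arobject signatures can be loaded for detection with only a few lines. Again, a hedged sketch: the “Guides” resource group name and the floating text label below are stand-ins for the guide content an app would actually ship:

```swift
import ARKit
import SceneKit

// A minimal sketch of detecting a previously scanned object and attaching
// a guide to it. The resource group and label are illustrative stand-ins.
final class GuideOverlayController: NSObject, ARSCNViewDelegate {
    func run(on sceneView: ARSCNView) {
        sceneView.delegate = self
        let configuration = ARWorldTrackingConfiguration()
        // Load the .arobject signatures bundled in an asset catalog group.
        configuration.detectionObjects = ARReferenceObject.referenceObjects(
            inGroupNamed: "Guides", bundle: nil) ?? []
        sceneView.session.run(configuration)
    }

    // ARKit calls this when one of the scanned signatures is recognized
    // in the camera feed, handing us a node anchored to the object.
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard let objectAnchor = anchor as? ARObjectAnchor else { return }
        // Float a simple label above the recognized object; a real app
        // would attach step-by-step guide geometry here instead.
        let text = SCNText(string: objectAnchor.referenceObject.name ?? "Part",
                           extrusionDepth: 1)
        let label = SCNNode(geometry: text)
        label.scale = SCNVector3(0.002, 0.002, 0.002)
        label.position.y = objectAnchor.referenceObject.extent.y
        node.addChildNode(label)
    }
}
```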

  • Social acceptance

The many issues associated with the launch of Google Glass made it clear that HMD devices are not yet acceptable to the consumer market. But our intuition is that focusing on the consumer market is inappropriate, at least initially, and that developers should instead target industrial settings (as Caudell and Mizell did at Boeing). A more appropriate metaphor for AR and VR devices (outside of their use in gaming) is a hard hat—something that you put on when you need to complete a task.

  • Amount of information (Distraction)

Martínez et al. are concerned that the “amount of information to be displayed in the augmented view may exceed the needs of the user.” This strikes us less as a bottleneck and more as a design guideline—take care to make AR objects as unobtrusive as possible.

In addition to the issues above, we think there are at least two other problems standing in the way of widespread AR adoption:

  • Authoring

There are a variety of apps that can help AR content creators author scenes manually, including Amazon Sumerian, Apple Reality Composer, Adobe Aero, and ScopeAR WorkLink. However, with these tools designers still must create, import, place, and orient models, as well as organize scenes temporally. We think there are opportunities to simplify this process with automation.

  • Value

Finally, as with any technology, users will not adopt AR unless it provides value in return for their investment of time and money. Luckily, AR technologies, specifically those involving remote assistance, enjoy a clear value proposition: reduced costs and less time wasted on travel. This is why we believe the current wave of interest in AR technologies may be different. Previous advances in the quality of HMDs and tracking technologies were not matched by similar advances in teleconferencing technologies and infrastructure. Now, however, robust, full-media teleconferencing technologies are commonplace, making remote AR sessions far more feasible.

Many tools already take advantage of a combination of AR and teleconferencing technologies. However, to truly stand in for an in-person visit, tele-work tools must facilitate a wide range of guided interaction. Experts feel they must travel to sites because they need to diagnose problems rapidly, change their point-of-view with ease to adapt to each particular situation, and experiment or interact with problems dynamically. This type of fluid action is difficult to achieve remotely when relaying commands through a local agent. In a future post, we will discuss methods we are developing to make this interaction as seamless as possible, as well as approaches for automated authoring. Stay tuned!

[1] H. Martínez et al. “Drivers and Bottlenecks in the Adoption of Augmented Reality Applications”. Journal of Multimedia Theory and Application, Volume 1, 27–44, 2014.

[2] T. P. Caudell and D. W. Mizell. “Augmented reality: An application of heads-up display technology to manual manufacturing processes”. In Proc. Hawaii Int’l Conf. on Systems Sciences, 659–669, 1992.

FXPAL @ ACM ISS 2019, November 10


FXPAL is presenting a Demo and a Poster at ISS 2019 in Daejeon, South Korea.

Demo

We propose a two-channel tabletop system that integrates document capture with a 4K video camera and hand tracking with a webcam. The document image and hand-skeleton data are transmitted at different rates and handled by a lightweight Web browser client at remote sites.
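
To make the two-channel idea concrete, here is a rough Swift sketch of what a capture-side sender could look like. The endpoint paths, rates, and payload shapes are our illustrative assumptions, not the demo’s actual wire format:

```swift
import Foundation

// A rough sketch of the two-channel design: document frames travel at a
// low rate, hand-skeleton joints at a high rate, each on its own
// WebSocket so a browser client can consume them separately.
final class TwoChannelSender {
    private let documentSocket: URLSessionWebSocketTask
    private let skeletonSocket: URLSessionWebSocketTask

    init(server: URL) {
        let session = URLSession(configuration: .default)
        documentSocket = session.webSocketTask(with: server.appendingPathComponent("document"))
        skeletonSocket = session.webSocketTask(with: server.appendingPathComponent("skeleton"))
        documentSocket.resume()
        skeletonSocket.resume()
    }

    // Large, slowly changing payload: roughly one document frame per second.
    func startDocumentChannel(captureJPEG: @escaping () -> Data) {
        Timer.scheduledTimer(withTimeInterval: 1.0, repeats: true) { [weak self] _ in
            self?.documentSocket.send(.data(captureJPEG())) { _ in }
        }
    }

    // Small, fast-changing payload: ~30 hand-skeleton updates per second.
    func startSkeletonChannel(captureJoints: @escaping () -> [Float]) {
        Timer.scheduledTimer(withTimeInterval: 1.0 / 30.0, repeats: true) { [weak self] _ in
            if let json = try? JSONEncoder().encode(captureJoints()) {
                self?.skeletonSocket.send(.data(json)) { _ in }
            }
        }
    }
}
```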

Toward Long Distance Tabletop Hand-Document Telepresence
Chelhwon Kim, Patrick Chiu, Joseph de la Pena, Laurent Denoue, Jun Shingu, Yulius Tjahjadi

Poster

We present a remote assistance system that enables a remotely located expert to provide guidance using hand gestures to a customer who performs a physical task in a different location.

A Web-Based Remote Assistance System with Gravity-Aware 3D Hand Gesture Visualization
Chelhwon Kim, Patrick Chiu, Yulius Tjahjadi

Come by and check them out!

Nudging the world toward better pictures and video


An excellent article on FXPAL’s NudgeCam application recently appeared in MIT’s Technology Review. NudgeCam encapsulates standard video-capture heuristics, such as how to frame a face and what good brightness looks like, and uses image-analysis techniques such as face recognition to guide users, as they shoot, on how to adjust the camera and improve the capture.
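
To give a flavor of this style of heuristic, here is a hedged sketch in Swift using Apple’s Vision and Core Image frameworks. NudgeCam’s actual rules and implementation differ (and predate these APIs); the thresholds and messages below are illustrative guesses:

```swift
import Vision
import CoreImage

// A sketch of capture heuristics in the spirit described above: detect a
// face, check its size and placement, and estimate overall brightness.
// Thresholds and tips are illustrative, not NudgeCam's actual rules.
func nudges(for image: CIImage) -> [String] {
    var tips: [String] = []

    // Framing heuristic: is there a face, and is it large and centered?
    let request = VNDetectFaceRectanglesRequest()
    try? VNImageRequestHandler(ciImage: image).perform([request])
    if let face = request.results?.first {
        if face.boundingBox.width < 0.2 {
            tips.append("Move closer: the face is small in the frame.")
        }
        if abs(face.boundingBox.midX - 0.5) > 0.2 {
            tips.append("Re-center the subject horizontally.")
        }
    } else {
        tips.append("No face found; check framing.")
    }

    // Brightness heuristic: average the whole frame down to one pixel
    // and compare its luminance against rough exposure thresholds.
    let average = CIFilter(name: "CIAreaAverage",
                           parameters: [kCIInputImageKey: image,
                                        kCIInputExtentKey: CIVector(cgRect: image.extent)])
    if let output = average?.outputImage {
        var pixel = [UInt8](repeating: 0, count: 4)
        CIContext().render(output, toBitmap: &pixel, rowBytes: 4,
                           bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
                           format: .RGBA8, colorSpace: CGColorSpaceCreateDeviceRGB())
        let luma = 0.299 * Double(pixel[0]) + 0.587 * Double(pixel[1]) + 0.114 * Double(pixel[2])
        if luma < 60 { tips.append("Scene looks dark; add light or raise exposure.") }
        if luma > 200 { tips.append("Scene looks blown out; lower exposure.") }
    }

    return tips
}
```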

For its size, FXPAL has a surprising breadth and variety of expertise. The NudgeCam work resulted from a collaboration between Scott Carter, whose expertise is in mobile and ubiquitous computing, and John Doherty, our multimedia specialist, who knows all the standard video capture heuristics and many more. John Adcock brought image analysis techniques to the team, and 2009 FXPAL summer intern Stacy Branham contributed her human-computer interaction expertise.

A different application, also developed at FXPAL, supports rephotography in an industrial setting. Rephotography is the art of taking a photograph from the same location and angle as a previous photograph.

pCubee: an interactive cubic display


Our friend Takashi Matsumoto (who built the Post-Bit system with us here at FXPAL) built a cubic display called Z-agon with colleagues at the Keio Media Design Laboratory. Takashi points us at this video of a very nicely realized cubic display (well, five-sided, but still). It’s called pCubee: a Perspective-Corrected Handheld Cubic Display and it comes from the Human Communications Technology Lab at the University of British Columbia. Some of you may have seen a version of this demoed at ACM Multimedia 2009; it will also be at CHI 2010. A longer and more detailed video is here.

Generating 3D models from webcams


One highly inconvenient thing about working with virtual worlds or 3D content in general is: where do your 3D models come from (especially if you’re on a budget)? A talented but (inevitably) overworked 3D artist? An online catalog of variable quality and cost? Messing around yourself with tools like SketchUp or Blender? What if you want something very specific, very quickly? The MIR (Mixed and Immersive Realities) team here at FXPAL is very interested in these questions and has done some work in this area. Others are working on it too: here’s an elegant demo from Qi Pan at the University of Cambridge, showing the construction of a model with textures from a webcam image:

ARdevcamp: Augmented Reality unconference Dec. 5 in Mountain View, New York, Sydney…


We’re looking forward to participating in ARdevcamp the first weekend in December. It’s being organized in part by Damon Hernandez of the Web3D Consortium, Gene Becker of Lightning Labs, and Mike Liebhold of the Institute for the Future (among others – it’s an unconference, so come help organize!) So far, there are ~60 people signed up; I’m not sure what capacity will be, but I’d sign up soon if you’re interested. You can add your name on the interest list here.

From the wiki:

The first Augmented Reality Development Camp (AR DevCamp) will be held in the SF Bay Area December 5, 2009.

After nearly 20 years in the research labs, Augmented Reality is taking shape as one of the next major waves of Internet innovation, overlaying and infusing the physical world with digital media, information and experiences. We believe AR must be fundamentally open, interoperable, extensible, and accessible to all, so that it can create the kinds of opportunities for expressiveness, communication, business and social good that we enjoy on the web and Internet today. As one step toward this goal of an Open AR web, we are organizing AR DevCamp 1.0, a full day of technical sessions and hacking opportunities in an open format, unconference style.

AR DevCamp: a gathering of the mobile AR, 3D graphics and geospatial web tribes; an unconference:
  • Timing: December 5th, 2009
  • Location: Hacker Dojo in Mountain View, CA

Looks like there will be some simultaneous ARdevcamp events elsewhere as well – New York and Manchester events are confirmed; Sydney, Seoul, Brisbane, and New Zealand events are possible but unconfirmed.