Improving the Expressiveness of Touch Input

December 6th, 2013

Touch input is now the preferred input method on mobile devices such as smartphones and tablets. Touch is also gaining traction in the desktop segment and is common for interaction with large tabletop or wall-based displays. At present, the majority of touch displays can detect only the location of a user's touch. Some capacitive touch screens can also report the contact area of a touch, but usually no further information about individual touch inputs is available to developers of mobile applications.

It would, however, be beneficial to capture further properties of the user’s touch, for instance the finger’s rotation around the vertical axis (i.e., the axis orthogonal to the plane of the touch screen) as well as its tilt (see images above). Obtaining rotation and tilt information for a touch would allow for expressive localized input gestures as well as new types of on-screen widgets that make use of the additional local input degrees of freedom.

Having finger pose information together with touches adds local degrees of freedom of input at each touch location. This, for instance, allows the user interface designer to remap established multi-touch gestures such as pinch-to-zoom to other user interface functions, or to free up screen space by letting input that usually requires space-consuming (multi-)touch gestures (e.g., adjusting a slider value, scrolling a list, panning a map view, enlarging a picture) be performed at a single touch location. New graphical user interface widgets that make use of finger pose information, such as rolling context menus, hidden flaps, or occlusion-aware widgets, have also been suggested.

Our PointPose prototype performs finger pose estimation at the location of touch using a short-range depth sensor viewing the touch screen of a mobile device. PointPose estimates the finger pose of a user touch by fitting a cylindrical model to the subset of the depth sensor's point cloud that corresponds to the user's finger, and we use the spatial location of the user's touch to seed the search for that subset.
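
To make the idea concrete, here is a minimal sketch of this style of estimation, not the actual PointPose implementation: it keeps the points near the reported touch, takes their principal axis (via power iteration on the covariance matrix) as the finger axis, and reads rotation and tilt off that axis. The function name, the neighborhood radius, and the coordinate conventions are all illustrative assumptions.

```javascript
// Illustrative sketch only (not the PointPose implementation): estimate a
// finger's rotation and tilt from a depth-sensor point cloud, seeded at the
// touch location. Assumes x/y lie in the screen plane and z points away
// from the screen; all names and thresholds are hypothetical.
function estimateFingerPose(points, touch, radius) {
  // 1. Seed: keep only the points near the reported touch location.
  const near = points.filter(p =>
    Math.hypot(p.x - touch.x, p.y - touch.y) < radius);
  if (near.length < 10) return null; // too little data to fit a model

  // 2. Centroid of the seeded subset.
  const c = { x: 0, y: 0, z: 0 };
  for (const p of near) { c.x += p.x; c.y += p.y; c.z += p.z; }
  c.x /= near.length; c.y /= near.length; c.z /= near.length;

  // 3. 3x3 covariance matrix of the subset.
  const C = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
  for (const p of near) {
    const d = [p.x - c.x, p.y - c.y, p.z - c.z];
    for (let i = 0; i < 3; i++)
      for (let j = 0; j < 3; j++) C[i][j] += d[i] * d[j];
  }

  // 4. Dominant eigenvector via power iteration ~ axis of the finger cylinder.
  let v = [1, 0, 0];
  for (let iter = 0; iter < 50; iter++) {
    const w = [0, 1, 2].map(i => C[i][0] * v[0] + C[i][1] * v[1] + C[i][2] * v[2]);
    const n = Math.hypot(w[0], w[1], w[2]);
    v = w.map(x => x / n);
  }
  if (v[2] < 0) v = v.map(x => -x); // orient the axis away from the screen

  // 5. Rotation about the screen normal and tilt relative to the screen plane.
  return {
    rotation: Math.atan2(v[1], v[0]),               // radians
    tilt: Math.atan2(v[2], Math.hypot(v[0], v[1])), // radians
  };
}
```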

One advantage of our approach is that it does not require complex external tracking hardware (as in related work), and external computation is unnecessary as the finger pose extraction algorithm is efficient enough to run directly on the mobile device. This makes PointPose ideal for prototyping and developing novel mobile user interfaces that use finger pose estimation.

Painted on a cathedral ceiling or it didn’t happen

October 2nd, 2013

My kids are home-schooled.  One of the many consequences is that they are sheltered from bureaucracy more than the average kid.

One of my teenagers is involved with a not-quite-local high school, because, well, why should the public school community be denied the joy of sharing in his perceived infallibility?  In order for me to volunteer to drive him and some classmates to an event, I needed to fill out a Form.  A teenager is not the best communication medium, but it only took a week of back-and-forth to determine that no electronic version existed, and to actually get the Form into my hands.  His last text about it was “I have the forms”.  And indeed, when he handed it to me, he said, “You have to fill it out in Quadruplicate!”

The Form, of course, was a carbonless paper form, with a white, orange, pink and yellow sheet.  I replied, “It’s okay.  It’s like carbon paper.  You just fill out the top.”  His whiny cry of “But how do you Know?” was less a doubting of my knowledge than a complaint that there was no “About this form” link at the bottom of the paper.  Though I still doubt it, he claimed to never have seen the like (remember that infallibility?).  (I also find it amusing that although Zingerman’s has gone digital and gotten rid of the carbonless ordering forms, they still say “Yellow copy” and “Pink copy” on the interim white receipts.)

A bit more questioning and discussion with my colleagues revealed that our kids really believe that there are only 3 generations of a technology:  What they and their peers use, what their parents use (now, not in their youth), and the original invention.

Thus text documents are either shared in the cloud, stored locally on a laptop/desktop, or painstakingly hand-duplicated by monastic scribes.  Personal music is streamed, parents listen to satellite radio and MP3s that came from old CDs, and people used to listen to rocks and sticks played around the communal cooking fire pit.   Vinyl LPs aren’t music at all.  As my kids said at a friend’s party a few years ago, “Why do you have those plastic things we make art bowls out of in your closet?”   We found a 20+ year old AAA Triptik for a cross-country drive and one of the kids asked how we updated that.  Might as well have been runes on dragon skin.

There are lots of other examples, and I’m resisting the urge to write about them.  But I’m thinking of all those intermediate technologies that are disappearing like so many 5 1/4″ floppies.

P.S. This post sat as a draft for about a year, and I’m only putting it out because I hear Gene’s voice asking me to put out content.  Which I intend to do.

 

Sorry for the down time and happy anniversary

September 19th, 2013

We moved into our new building about 2 years ago.  Long enough ago that we have quite a few energetic new employees who don’t know that we were ever anywhere else.  But the “new” place is nice, and getting better, and worthy of celebrating, at least in a little way.

I was thinking of bringing in donuts on Monday to celebrate, in order to follow one of Gene’s bagel rules:  If you want donuts, you have to get them yourself.   However, hard drives play by their own rules.

The FXPAL Blog was one of the few web servers we had that ran directly on server hardware, since it started before “clouds”.  When the disk sneezed over the weekend, the site went down.  So I skipped the donut pickup to pick up the pieces of our blog.  We took this as an opportunity to virtualize and update the underlying infrastructure.  I expect a few plugins are not quite right, and the title bar is messed up – sorry, Tony.

Once I get it all right, I’ll bring the donuts.

 

 

Copying and Pasting from Video

September 11th, 2013

This week at the ACM Conference on Document Engineering, Laurent and Scott are presenting new work on direct manipulation of video.  The ShowHow project is our latest activity involving expository or “how to” video creation and use.  While watching videos of this genre, it is helpful to create annotations that identify useful frames or shots, either directly with ShowHow’s annotation capability or by creating a separate multimedia notes document.  The primary purpose of such annotation is later reference, or incorporation into other videos or documents.  While browser history might be able to get you back to a specific video you watched previously, it won’t readily get you to a specific portion of a much longer source video, or provide you with the broader context in which you found that portion of the video noteworthy.  ShowHow enables users to create rich annotations around expository video that optionally include image, audio, or text to preserve this contextual information.

For creating these annotations, copy and paste functionality from the source video is desirable.  This could mean selecting a (sub)frame as an image or even selecting text shown in the video.  We also demonstrate capturing dynamic activity across frames in a simple animated GIF for easy copy and paste from video to the clipboard.  There are interaction design challenges here, and, especially as more content is viewed on mobile/touch devices, direct manipulation provides a natural means for fine control of selection.
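
As a rough illustration of what sub-frame copying can look like in a browser, here is a sketch under assumed element and parameter names, not the ShowHow code: it crops the currently displayed frame of a video element onto a canvas and returns it as a PNG data URL that could be pasted into a notes document.

```javascript
// Hypothetical sketch (not the ShowHow code): copy a selected sub-frame of the
// currently displayed video frame to an image. Element and parameter names are
// assumptions; a cross-origin video source would taint the canvas.
function copySubframe(video, selection) {
  // selection: { x, y, width, height } in video pixel coordinates
  const canvas = document.createElement('canvas');
  canvas.width = selection.width;
  canvas.height = selection.height;
  const ctx = canvas.getContext('2d');
  // Draw only the selected region of the frame currently shown in the <video>.
  ctx.drawImage(video,
                selection.x, selection.y, selection.width, selection.height,
                0, 0, selection.width, selection.height);
  return canvas.toDataURL('image/png'); // paste into a notes document, etc.
}

// Example usage, assuming a <video id="howto"> element on the page:
// const pngUrl = copySubframe(document.getElementById('howto'),
//                             { x: 100, y: 50, width: 320, height: 240 });
```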

Under the hood, content analysis is required to identify events in the video that help drive the user interaction.  In this case, the analysis is implemented in JavaScript and runs in the browser in which the video is being played.  So efficient implementations of standard image analysis tools such as region segmentation, edge detection, and region tracking are required.  There’s a natural tradeoff between robustness and efficiency here that constrains the content processing techniques.
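
To give a flavor of the kind of lightweight, in-browser analysis this implies, here is an illustrative sketch of one standard building block, a Sobel edge detector run over a frame grabbed from the video. It is a sketch only, not the analysis code used in ShowHow, and all names are assumptions.

```javascript
// Illustrative only (not the ShowHow analysis code): grab the current video
// frame into a canvas and compute a Sobel edge-magnitude map in plain JavaScript.
function sobelEdges(video) {
  const w = video.videoWidth, h = video.videoHeight;
  const canvas = document.createElement('canvas');
  canvas.width = w;
  canvas.height = h;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, w, h);
  const { data } = ctx.getImageData(0, 0, w, h); // RGBA bytes

  // Convert to grayscale.
  const gray = new Float32Array(w * h);
  for (let i = 0; i < w * h; i++) {
    gray[i] = 0.299 * data[4 * i] + 0.587 * data[4 * i + 1] + 0.114 * data[4 * i + 2];
  }

  // 3x3 Sobel gradients; edge strength is the gradient magnitude.
  const edges = new Float32Array(w * h);
  for (let y = 1; y < h - 1; y++) {
    for (let x = 1; x < w - 1; x++) {
      const i = y * w + x;
      const gx = -gray[i - w - 1] + gray[i - w + 1]
                 - 2 * gray[i - 1] + 2 * gray[i + 1]
                 - gray[i + w - 1] + gray[i + w + 1];
      const gy = -gray[i - w - 1] - 2 * gray[i - w] - gray[i - w + 1]
                 + gray[i + w - 1] + 2 * gray[i + w] + gray[i + w + 1];
      edges[i] = Math.hypot(gx, gy);
    }
  }
  return { edges, width: w, height: h };
}
```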

The interaction enabled by the system is probably best described in the video below:

Video Copy and Paste Demo

Go find Scott or Laurent in Florence or contact us for more information.

Remembering Gene

August 22nd, 2013

We are very sorry to report that Gene Golovchinsky passed away on August 15, 2013.

His friends have created an online remembrances forum at http://genegolovchinsky.blogspot.com.

Gene was the heart and soul of this blog; he wrote 3/4 of the posts, and it exists solely because he pushed this admin to create it and do occasional maintenance.  It cannot be the same without him, but I hope it will not stop.

Looking ahead

July 5th, 2013

It is reasonably well-known that people who examine search results often don’t go past the first few hits, perhaps stopping at the “fold” or at the end of the first page. It’s a habit we’ve acquired thanks to high-quality results for precision-oriented information needs. Google has trained us well.

But this habit may not always be useful when confronted with uncommon, recall-oriented information needs, that is, when doing research. Looking only at the top few documents places too much trust in the ranking algorithm. In our SIGIR 2013 paper, we investigated what happens when a lightweight preview mechanism gives searchers a glimpse at the distribution of documents (new, re-retrieved but not seen, and already seen) for the query they are about to execute.
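
As a minimal sketch of the bookkeeping such a preview requires (the actual mechanism and interface are described in the paper), each result of the candidate query can be classified against the sets of documents retrieved and seen earlier in the session. The function and set names below are hypothetical.

```javascript
// Hypothetical sketch of the preview bookkeeping (not the system in the paper):
// classify each result of the query about to be executed as new,
// re-retrieved-but-not-seen, or already seen, given the session history.
function previewDistribution(resultIds, retrievedBefore, seenBefore) {
  // retrievedBefore, seenBefore: Sets of document ids from earlier queries
  const counts = { fresh: 0, reRetrieved: 0, seen: 0 };
  for (const id of resultIds) {
    if (seenBefore.has(id)) counts.seen++;
    else if (retrievedBefore.has(id)) counts.reRetrieved++;
    else counts.fresh++;
  }
  return counts;
}

// Example: preview what a candidate query would bring back.
// previewDistribution(['d1', 'd7', 'd9'], new Set(['d7', 'd9']), new Set(['d9']));
// -> { fresh: 1, reRetrieved: 1, seen: 1 }
```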

Read the rest of this entry »

In Defense of the Skeuomorph, or Maybe Not…

June 10th, 2013

Jony Ive is a fantastic designer. As a rule, his vision for a device sets the trend for that entire class of devices. Apparently, Jony Ive hates skeuomorphic design elements. Skeuomorphs are those sometimes corny bits of realism some designers add to user interfaces. These design elements reference an application’s analog embodiment. Apple’s desktop and mobile interfaces are littered with them. Their notepad application looks like a notepad. Hell, the hard drive icon on my desktop is a very nice rendering of the hard drive that is actually in my desktop.

Read the rest of this entry »

Client-side search

April 23rd, 2013

When we rolled out the CHI 2013 previews site, we got a couple of requests to be able to search the site by keyword. Of course, interfaces for search are one of my core research interests, so that request got me thinking. How could we do search on this site? The problem with the conventional approach to search is that it requires some server-side code to do the searching and to return results to the client. This approach wouldn’t work for our simple web site, because from the server’s perspective, our site was static: just a few HTML files, a little bit of JavaScript, and about 600 videos. Using Google to search the site wouldn’t work either, because most of the searchable content is located on two pages, with hundreds of items on each page. So what to do?
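
One plausible way to do this, sketched below with hypothetical selectors and class names rather than the approach the rest of this post describes, is to build a small in-memory index of the items already on the page and match keyword queries against it entirely in the browser, with no server-side code at all.

```javascript
// One plausible client-side approach (hypothetical selectors and names, not
// necessarily the solution the rest of this post describes): index the text of
// each item on the page, then filter by keyword entirely in the browser.
function buildIndex(selector) {
  // One entry per item: the element plus a lower-cased copy of its text.
  return Array.from(document.querySelectorAll(selector)).map(el => ({
    el,
    text: el.textContent.toLowerCase(),
  }));
}

function search(index, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  // An item matches if it contains every query term (simple AND semantics).
  return index
    .filter(item => terms.every(t => item.text.includes(t)))
    .map(item => item.el);
}

// Example usage on a page whose preview items carry class "preview-item":
// const index = buildIndex('.preview-item');
// search(index, 'touch gestures').forEach(el => el.classList.add('hit'));
```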

Read the rest of this entry »

CHI 2013 Video Previews are live!

April 20th, 2013

You might remember that a while ago we solicited some examples of videos for the Video Preview program for CHI 2013. Well, it took a while, but the CHI 2013 Video Previews web site is now live.

The Video Previews are a new feature for the CHI Conference series, replacing the long-running CHI Madness daily plenary session to save time in the over-crowded schedule. But really, the Video Previews are more than just a reason to sleep in a little longer: the goal is to make it easier to understand what the presentations are about, before, during, and after the conference.

The previews were intended to serve multiple purposes:

  • To provide a preview of what will be presented at the conference, so that attendees could plan their schedule
  • To be played on-site on large displays throughout the conference venue to give people an idea of what’s coming up next
  • To be distributed to attendees as part of the electronic proceedings on the USB stick, and in the iPhone and Android apps.

Read the rest of this entry »

Details, please

April 15th, 2013

At a PARC Forum a few years ago, I heard Marissa Mayer mention the work they did at Google to pick just the right shade of blue for link anchors to maximize click-through rates. It was an interesting, if somewhat bizarre, finding that shed more light on Google’s cognitive processes than on human ones. I suppose this stuff only really matters when you’re operating at Google scale, but normally the effect, even if statistically significant, is practically meaningless. But I digress.

I am writing a paper in which I would like to cite this work. Where do I find it? I tried a few obvious searches in the ACM DL and found nothing. I searched in Google Scholar, and I believe I found a book chapter that cited a Guardian article from 2009, which mentioned this work. But that was last night, and today I cannot re-find that book chapter, either by searching or by examining my browsing history. The Guardian article is still open in a tab, so I am pretty sure I didn’t dream up the episode, but it is somewhat disconcerting that I cannot retrace my steps.

Read the rest of this entry »