Monday, August 23, 2010

ISMAR 2010 Demo Video

Finally, my paper has been accepted to ISMAR 2010, a very nice conference in the Augmented Reality field. Here are my demo videos.

Reference: W. Lee, Y. Park, V. Lepetit, W. Woo, "Point-and-Shoot for Ubiquitous Tagging on Mobile Phones," International Symposium on Mixed and Augmented Reality (ISMAR), 2010.

You can find more screenshots here.

PC version implementation

iPhone version implementation

Thursday, July 8, 2010

3rd Day of ISUVR 2010 - Panel talk

The invited speakers and Sebastien Duval held a panel talk discussing the definitions and applications of digital ecosystems.

3rd Day of ISUVR 2010 - Paper Session 2

Andreas Duenser et al. presented the paper 'Evaluation of Tangible User Interfaces for Desktop AR'.

In the paper they discussed how to evaluate tangible user interfaces in an AR desktop environment. Andreas introduced a TUI called the Slider interface, designed for AR.
The user studies revealed:
  • The paddle interface suffers from tracking jitter, and selection is difficult with it.
  • The mouse interface is the fastest in task completion, but users confuse forward and backward movement in the AR environment.
  • With the slider interface it is difficult to select a specific value; it is better suited to selecting a relative value.
  • All interfaces show the same accuracy.

Muhammad Rusdi Syamsuddin presented the paper 'Research on Virtual World and Real World Integration for Batting Practice'.

The paper is about a system that relates the real and the virtual world through pitching and batting actions in a baseball game. The scenario is as follows: the pitching data of a professional baseball player, available on the MLB website, is simulated in the virtual space, and the user in the real world becomes the batter, using a Wiimote as the interface. The idea of the system is very interesting.

3rd Day of ISUVR 2010 - Invited Talk 3

Joshua Harlan Lifton gave his talk 'Consumer Adoption of Cross Reality Systems'.

He talked about connecting the real and the virtual world through sensor networks. Real-world events are reflected in the virtual world using different types of sensors (sound, network flow, electric power usage, etc.). Much of his work is based on the 3D virtual space Second Life, which is good for visualizing events.

However, 3D virtual space is not everything. Building services on existing consumer technologies is another future extension.

2nd Day of ISUVR 2010 - Invited Talk 2

Seunghee Lee gave us a presentation about 3D modeling and animation for AR.

There are two major methods for motion creation: keyframe-based and data-driven. His interest is in physics-based animation, which takes the physical properties of an object into account. The physics-based method can create complex motions that are very difficult to create with keyframe-based methods.

But physics alone is not enough for complex motion. We need one more thing for realistic motion: motor control principles.

The questions in this area are:
  • What is a realistic and robust motor controller?
  • How can complex motions be computed fast?
To create realistic motion/animation, a keyframe-based animation made by hand is corrected using physical analysis. Visually, it is not easy to see the differences between the hand-made and the corrected animation. But the point is that hand-made animations take much time to create, whereas the physics-based approach lets us make them instantly.

Wednesday, July 7, 2010

2nd Day of ISUVR 2010 - Invited Talk 1

Gerhard Reitmayr gave a talk entitled 'Panoramic Mapping and Tracking on Mobile Phones'.

He mentioned an approach for connecting the model generated by SLAM to the real space (using sensors, recognition techniques, etc.).

PanoMT is a panorama tracking method that runs on mobile phones at 30 fps. It uses sensor data for the camera's roll and pitch rotations, while also using multi-scale feature tracking on images.
Interesting features of PanoMT are:
  • Panorama correction: Aligning a wrong panorama estimate by correcting the cylindrical projection.
  • Loop closing: RANSAC matching is used to register the image over 360 degrees.
Sensors are noisy and calibration is required. For example, accelerometers and compasses suffer from transient and local disturbances, so sensor calibration is unavoidable, whereas visual tracking provides an accurate pose relative to a model. Hybrid tracking using both reduces errors in the sensor values.

Gerhard argued that using maps is better than frame-to-frame detection/tracking, which is computationally expensive: maps mean less redundancy, less data, and slow changes. Template matching with the Walsh transform and NCC is used to match the selected template against the panorama.
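
As a rough illustration of the NCC part of that template matching, here is a minimal C sketch. It is not PanoMT's implementation: the function names and the exhaustive search are assumptions made for illustration, and the Walsh transform step is left out. To avoid a square root, it returns the sign-preserving squared NCC, which ranks candidate positions in the same order as plain NCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zero-mean normalized cross-correlation between a
 * tw x th template and the same-sized window of an 8-bit grayscale image
 * whose top-left corner is at (x, y). Returns the sign-preserving squared
 * NCC in [-1, 1], or 0 when either patch has no contrast. */
double ncc_signed_sq(const unsigned char *img, int img_stride, int x, int y,
                     const unsigned char *tpl, int tw, int th)
{
    double n = (double)tw * th;
    double sum_i = 0.0, sum_t = 0.0;
    for (int r = 0; r < th; ++r)
        for (int c = 0; c < tw; ++c) {
            sum_i += img[(y + r) * img_stride + (x + c)];
            sum_t += tpl[r * tw + c];
        }
    double mean_i = sum_i / n, mean_t = sum_t / n;

    double num = 0.0, den_i = 0.0, den_t = 0.0;
    for (int r = 0; r < th; ++r)
        for (int c = 0; c < tw; ++c) {
            double di = img[(y + r) * img_stride + (x + c)] - mean_i;
            double dt = tpl[r * tw + c] - mean_t;
            num += di * dt;     /* cross term */
            den_i += di * di;   /* image window energy */
            den_t += dt * dt;   /* template energy */
        }
    if (den_i <= 0.0 || den_t <= 0.0)
        return 0.0;             /* flat patch: correlation undefined */
    return num * (num < 0.0 ? -num : num) / (den_i * den_t);
}

/* Hypothetical helper: exhaustive search for the window that matches the
 * template best. Writes its top-left corner to (*best_x, *best_y) and
 * returns its score. The template must fit inside the image. */
double ncc_best_match(const unsigned char *img, int iw, int ih,
                      const unsigned char *tpl, int tw, int th,
                      int *best_x, int *best_y)
{
    assert(tw > 0 && th > 0 && tw <= iw && th <= ih);
    double best = -2.0;         /* below any possible score */
    for (int y = 0; y + th <= ih; ++y)
        for (int x = 0; x + tw <= iw; ++x) {
            double s = ncc_signed_sq(img, iw, x, y, tpl, tw, th);
            if (s > best) {
                best = s;
                *best_x = x;
                *best_y = y;
            }
        }
    return best;
}
```

Because the score is normalized by the energy of both patches, it does not change under uniform brightness and contrast changes, which is one reason NCC is a common choice for matching a template against a map.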

As a last topic, he mentioned a visualization issue called image-based ghosting. Simply overlaying virtual objects occludes the real scene, which provides many visual cues. So the question 'which information has to be presented?' arises here.
His solution to the problem is to find cues in the image. Through image analysis and user interaction, ghosting becomes possible. For better performance, panoramic remapping is used again.

He concluded the talk with future directions for the current work: extending to 6DOF tracking, and object detection to link with applications.

2nd Day of ISUVR 2010 - Paper Session 1

Changhyeon Lee et al. presented the paper 'Networked Collaborative Group Cheerleading Technology: Virtual Cheerleader Experience', which is about a prototype implementation of collaboration among remote users in a VR space. They used Second Life as the virtual interaction space, where live video from a baseball stadium is displayed. The users at remote sites interact through a Wiimote.

2nd Day of ISUVR 2010 - Keynote

The 2nd day of ISUVR started with an invited talk by Anton van den Hengel (The Australian Centre for Visual Technologies). The title was 'Image-based modelling for augmenting reality'.

He said that current mobile AR browsers show an information overlay on the real world, but the information is not registered to the geometry.
Anton claims that User Created Content (UCC) is required for ubiquitous AR. For AR, 3D content is better, but there have not been good UCC tools. With current tools, an epic effort is required to create 3D content.

Image-based modeling is a good way to create 3D content because images contain many cues for 3D modeling. There are two ways to do image-based 3D modeling. Automatic methods generate 3D models of everything, like a 3D laser scanner. Interactive methods allow the user to choose the object to interact with.

He introduced VideoTrace, an interactive image-based 3D modeling system. It is a very nice interactive tool for 3D modeling (however, from my viewpoint, it requires a bit of labor, such as moving among video frames and modifying 3D geometry, to finish modeling).

To help 3D modeling, several features and techniques can be exploited:
  • Lines and curves for geometry modeling
  • Mirroring, extrusion, and dense meshing make modeling an object easier.
  • Surface fitting helps to align planar surfaces that lie on the same infinite plane.

VideoTrace works on a recorded video sequence. The same idea is extended to live AR modeling, so-called in-situ modeling. The video below explains the concept and implementation well.

Anton also discussed the misalignment that exists between the modeled real objects and the synthesized virtual ones. It causes visual defects at the occlusion boundary. This problem can be addressed by a graph-cut-based segmentation method that exploits color distributions.

VideoTrace requires much user interaction, which is sometimes painful when working with complex 3D objects. To reduce user interaction, a silhouette modeling method is adopted. Silhouette-based modeling followed by segmentation generates a 3D model from the video without interaction.

Very nice talk, thank you Anton!

Tuesday, July 6, 2010

1st Day of ISUVR 2010

Today, the International Symposium on Ubiquitous VR 2010 started.

Andreas Duenser from HITLab NZ gave the attendees a good tutorial about user evaluation. In his tutorial, he explained:

  • Why user evaluation is required.
  • How to do user evaluations.
  • Different evaluation methods
  • How to interpret the evaluation results.
  • Papers related to user evaluation

The tutorial was very useful, because we always have questions like "Is this really useful?" or "Is this better than others?" when we do something new as engineers.

Wednesday, June 16, 2010

Marker tracking demo on iOS 4 GM Seed

As announced at WWDC 2010, iOS 4 supports full camera access. I made a simple marker tracking demo using the new camera API. It was very simple and straightforward to use.

Try the AVCaptureSession class for iPhone camera access!

Tuesday, April 27, 2010

iPhone camera access on OS 3.x : A workaround

Please note that you read this post and try the code below at your own risk.

Recently, I found a workaround for grabbing raw image data from the iPhone's camera on OS 3.x.

As I previously posted, using PLCameraController does not provide a concrete solution for AR since the captured image contains overlaid contents (a related post here). And the problem had been left unsolved.

But someone left a comment saying that using two windows can solve the problem (I don't know his/her name, because the comment was anonymous). The idea is somewhat weird because we usually make only one window for an iPhone application, but other people on the Internet also say that using two windows is a workaround for the problem. So, I did some digging with the idea and eventually figured out how to do it.

As you may know, iPhone OS4 may allow full camera access for developers. I'm using the iPhone for my research, so this method is useful until OS4 comes.

Now, let me explain how to do it.

First, we need two windows. Just add one more UIWindow to your application delegate. We will add the PLCameraController's previewView to previewWindow.

UIWindow *window;
UIWindow *previewWindow ;

Then, go to the 'applicationDidFinishLaunching' method. Make the main window transparent so that the view that will be added to the previewWindow is visible through it.

// Make the window background transparent
window.opaque = NO ;
window.backgroundColor = [UIColor clearColor] ;

Initialize the main view. In my case, the view is an OpenGL view. I make the OpenGL view transparent so that the video background shows through. If you don't know why, please refer to my previous post about making a video background without texture mapping.

glView = [[EAGLView alloc]
initWithFrame:CGRectMake(0, 0, video_width, video_height)
preserveBackbuffer:NO] ;
glView.opaque = NO ;
glView.backgroundColor = [UIColor clearColor] ;
[window addSubview:glView] ;

Initialize PLCameraController and the previewWindow.

if(camController != nil)
[camController release] ;
// Initialization of PLCameraController
camController = ..... ;

previewWindow = [[UIWindow alloc] init] ;
previewWindow.autoresizesSubviews = YES ;

Add the previewView of PLCameraController to the previewWindow and make the previewWindow visible. The size of 'preview.frame' becomes the size of the image we will get from the camera. In this post, I just used (320,426).

UIView * preview = [camController previewView] ;
preview.frame = CGRectMake(0,0, 320, 426) ;
[previewWindow addSubview:preview] ;
[previewWindow makeKeyAndVisible] ;

Finally, make the main window visible.

[window makeKeyAndVisible];

OK, run your application and you will see a video preview. To retrieve RGBA data from the camera, a well-known method is enough, something like this.

// Grab the surface that backs the camera preview
CoreSurfaceBufferRef coreSurfaceBuffer =
    [cameraController _createPreviewIOSurface];
if (!coreSurfaceBuffer) return;
Surface *surface =
    [[Surface alloc] initWithCoreSurfaceBuffer:coreSurfaceBuffer];
[surface lock];
previewHeight = surface.height;
previewWidth = surface.width;
previewBytesPerRow = previewWidth*4;    // 4 bytes per RGBA pixel
pixelDataLength = previewBytesPerRow*previewHeight;
void *pixels = surface.baseAddress;
int surfaceBytesPerRow = surface.bytesPerRow ;  // actual row stride of the surface

// Copy the pixels to your buffer
// (a single memcpy assumes surfaceBytesPerRow == previewBytesPerRow)
memcpy(Your_buffer, pixels, pixelDataLength) ;

Here is a screen shot of my simple AR application, where a virtual character is synthesized on a real scene.

Note that the workaround I explained is not perfect. One limitation of this method is that the retrieved image sometimes has misaligned scanlines. Another problem is auto-focusing on the 3GS. The focus rectangle that appears when the camera focuses is captured in the retrieved image data. I just couldn't disable it.
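
One possible cause of misaligned scanlines, though this is only a guess and not something verified on a device, is a surface whose rows are padded, so that surface.bytesPerRow is larger than previewWidth*4. In that case a single memcpy of the whole buffer shifts every row a little further. A minimal C sketch of a row-by-row copy that honors the stride follows; copy_rows_packed is a hypothetical helper, not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `height` rows of `row_bytes` bytes each from a
 * source whose rows start `src_stride` bytes apart into a tightly packed
 * destination buffer of row_bytes * height bytes. */
void copy_rows_packed(unsigned char *dst, const unsigned char *src,
                      size_t row_bytes, size_t src_stride, size_t height)
{
    assert(src_stride >= row_bytes);

    if (src_stride == row_bytes) {
        /* No padding: one big copy is equivalent and faster. */
        memcpy(dst, src, row_bytes * height);
        return;
    }
    for (size_t y = 0; y < height; ++y) {
        /* Skip the padding at the end of each source row. */
        memcpy(dst + y * row_bytes, src + y * src_stride, row_bytes);
    }
}
```

With the variables from the snippet above, the call would look like copy_rows_packed(Your_buffer, pixels, previewBytesPerRow, surface.bytesPerRow, previewHeight).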

So, the iPhone is getting closer to being a good device for AR, and with OS4 there will be plenty of AR apps in the store.

Update: To disable the focus rectangle, you need to implement PLCameraController's delegate method cameraControllerReadyStateChanged. So, set a delegate object to the PLCameraController and implement the method like this:

- (void)cameraControllerReadyStateChanged:(NSNotification *)aNotification
{
    [cameraController setDontShowFocus:YES] ;
}

Then you will not see the focus rectangle. Thanks for your comment, Arrix!

Monday, February 22, 2010

AR game running on Samsung's Bada

An AR tower defense game running on Samsung's bada platform has been demonstrated. Fiducial markers are tracked on a mobile phone and contents are synthesized on the video. Well, the hardware seems to be pretty good and capable of doing AR things.

I think Samsung should show an extraordinary application that makes bada stand out among the many mobile phone OSs. So, mobile augmented reality can be a good candidate ;-)

Monday, February 8, 2010

ESM algorithm library is released !!

The ESM (Efficient Second-order Minimization) algorithm library has been released by its authors. Currently, binary builds of the library are available for Windows (.dll) and Linux (.a). For more details, visit its website to download it.

Sunday, January 10, 2010

A good tutorial about using 'cameraOverlayView' feature for Augmented Reality on iPhone

Update: This method is no longer useful because Apple provides camera APIs now. 

Here is a nice example showing how to use a new feature, cameraOverlayView, for AR applications on iPhone. There is example code, so you can easily learn how to do AR on iPhone.

In the post, what the author tries to do is grab a screenshot through UIGetScreenImage() and analyze it. UIGetScreenImage() just provides a screenshot, so the overlaid contents remain in it. The author of the post tries to remove the contents drawn on the overlay view by simple interpolation.

I downloaded the example and tested it. The image below is a screenshot obtained by UIGetScreenImage(). You can see the overlaid green edge marks.

And, after interpolation, the code gives the following image. It looks OK for now. You may see seams in the image, which are the interpolated pixels.

Another example:

However, this approach may not work if we draw large objects like a cube, since we cannot remove the cube through interpolation. Thus, this approach is definitely not good for tracking either. Another problem is that the interpolation itself takes some time, which may be too much on mobile phones.
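
To make the interpolation idea concrete, here is a minimal C sketch of one simple variant. It is not the code from the linked tutorial: the function names, the grayscale format, and the per-pixel mask marking where the overlay was drawn are all assumptions made for illustration. Each run of overlaid pixels in a row is replaced by a linear blend of the nearest clean pixels on its left and right.

```c
#include <stddef.h>

/* Hypothetical helper: fill overlaid pixels in one grayscale row.
 * mask[x] != 0 marks a pixel covered by the overlay. */
void fill_overlay_row(unsigned char *row, const unsigned char *mask, int width)
{
    int x = 0;
    while (x < width) {
        if (!mask[x]) {
            ++x;
            continue;
        }
        int start = x;
        while (x < width && mask[x])
            ++x;                        /* x is now one past the run */
        int left = start - 1;           /* nearest clean pixel on the left */
        int right = x;                  /* nearest clean pixel on the right */
        for (int i = start; i < x; ++i) {
            if (left >= 0 && right < width) {
                /* Linear blend between the two clean neighbours, rounded. */
                int span = right - left;
                row[i] = (unsigned char)((row[left] * (right - i) +
                                          row[right] * (i - left) +
                                          span / 2) / span);
            } else if (left >= 0) {
                row[i] = row[left];     /* run touches the right border */
            } else if (right < width) {
                row[i] = row[right];    /* run touches the left border */
            }
            /* else: the whole row is covered, nothing to interpolate from */
        }
    }
}

/* Hypothetical helper: apply the row fill to a tightly packed image. */
void fill_overlay_image(unsigned char *img, const unsigned char *mask,
                        int width, int height)
{
    for (int y = 0; y < height; ++y)
        fill_overlay_row(img + (size_t)y * width, mask + (size_t)y * width, width);
}
```

The wider the masked run, the further apart the two clean pixels are and the less the blend resembles the real scene, which is why thin edge marks can be hidden this way but a large object like a cube cannot.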

Sunday, January 3, 2010

iPhone Game Programming Tutorials with Videos

Nice tutorials and videos.