Camera-Based Recognition / Retrieval of
Characters and Documents in Real-Time
with Large-Scale Databases
Koichi Kise (Osaka, Japan)
In my talk, I will introduce two real-time
methods of camera-based document analysis.
One is a method of document image retrieval
called LLAH (Locally Likely Arrangement Hashing).
This technique currently handles a database
of up to 50 million pages, indexed with
10 billion feature vectors.
Given a camera-captured part of a page as a query,
it retrieves the corresponding page with more than
98% accuracy in real time (less than 150 ms/query).
In addition, the method allows us to estimate
the perspective transformation between the
camera-captured query and the corresponding database
image. This enables us to superimpose arbitrary images
onto the retrieved page.
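
To illustrate the perspective-transformation step, here is a minimal
sketch, not the LLAH implementation itself. It assumes that four point
correspondences between the database page and the camera frame are
already known, and it solves directly for the 3x3 homography with the
last entry fixed to 1. The function names (solve_linear,
estimate_homography, project) and the sample coordinates are
hypothetical; a practical system would likely use many correspondences
and a robust estimator.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_homography(src, dst):
    """Estimate H (3x3, H[2][2] = 1) mapping 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), same form for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(hm, point):
    """Apply homography hm to a 2D point, including the perspective divide."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    u = (hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w
    v = (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w
    return (u, v)


# Hypothetical data: page corners in database coordinates and the
# positions at which they appear in the camera frame.
page = [(0, 0), (100, 0), (100, 150), (0, 150)]
camera = [(20, 30), (210, 45), (190, 320), (10, 280)]
hm = estimate_homography(page, camera)

# Any point of an overlay defined in page coordinates can now be placed
# in the camera frame, e.g. the centre of the page.
overlay_centre_in_camera = project(hm, (50, 75))
```

Mapping each corner of an overlay image through project gives the
quadrilateral in the camera frame onto which the overlay would be drawn.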
The other is a method of real-time character
recognition, which won the best paper award
at DAS 2010. A distinguishing feature
of this method is that recognition is
independent of the layout of the characters.
Thus characters laid out in circles or along wavy
lines can be recognized. The method applies
a hashing technique, aided by a new,
efficient shape indexing scheme. The poses of
recognized characters are also estimated.
I will bring demos of both
methods. One is an implementation of LLAH
with a database of 1 million pages. The other
is a demo of real-time character recognition.
Both run in real time on an ordinary laptop computer.