Project Presentation-Virtual Keyboard


SRI RAMAKRISHNA ENGINEERING COLLEGE
(An Autonomous Institution, Affiliated to Anna University, Chennai)
An NBA Accredited Institution
Vattamalaipalayam, N.G.G.O. Colony Post,
Coimbatore - 641022
Phone: 0422 - 2460088, 2461588

PROJECT PRESENTATION
NAME: JAYACHANDRAN.R, KARTHIKPRABHU.S
DEPARTMENT: COMPUTER SCIENCE ENGINEERING
REGISTER NO: 1301166, 1301160
SEMESTER: 2nd year, 3rd semester
COLLEGE NAME: SRI RAMAKRISHNA ENGINEERING COLLEGE

“WEBCAM BASED VIRTUAL KEYBOARD”
LIST OF FIGURES
NAME OF THE FIGURE

 Need for Virtual Keyboard
 Problem Statement
 Existing System
 Drawbacks
 Virtual Keyboard Methods and Devices
 Visual Panel
 Finger-Joint Gesture Wearable Keyboard
 Thumbcode
 Chording Glove
 FingeRing
 TouchStream
 Multi-Point Touchpad
 VKey
 VKB Projection
 Scurry
 Senseboard
 Comparison between Traditional Keyboard and Virtual Keyboard
 Structure of the Virtual Keyboard System
 Practical Setup
 Use Case Diagram
 Sequence Diagram for Image Acquisition
 Sequence Diagram for Interrupt Detection in Frame Sequence
 Sequence Diagram for Finger Extraction using Threshold Algorithm
 Sequence Diagram for Finger Tip Analysis using Edge Detection
 Sequence Diagram for Key Extraction
 Class Diagram for Virtual Keyboard
 Activity Diagram for Virtual Keyboard
 Block Diagram for Image Acquisition
 Snapshot of Image Acquisition
 Block Diagram for Interrupt Detection in Frame Sequence
 Snapshot of Interrupt Detection
 Block Diagram for Finger Extraction using Threshold Algorithm
 Snapshot of Finger Extraction using Threshold Algorithm
 Snapshot of Finger Tip Analysis by Edge Detection Method
 Snapshot of Key Extraction
 WinRunner Testing Result
 WinRunner Testing Code
 System Testing
 Conclusion
 Screen Shot for Choosing the Environment
LIST OF SYMBOLS
SYMBOL          DESCRIPTION
P(C|x)          Posterior probability
P(C)            Prior probability
P(x|C)          Class likelihood (conditional probability)
x               Feature vector
p(x)            Evidence
max P(C|x)      Highest posterior probability
C               Array of pre-stored key elements

CHAPTER 1
INTRODUCTION

1.1 NEED FOR VIRTUAL KEYBOARD
Historically, keyboards were designed for printing text characters and, later, for entering text characters into attached devices. The keyboard is therefore a communication mechanism between human beings and computing machines, used to enter textual data and commands. Virtual keyboards are generally assumed to produce the same output as the conventional QWERTY layout. Virtual keyboards are used where space is scarce, where soft programmability of keys is required, where mechanical failure must be avoided, or in mobile situations where the usability of a standard keyboard is limited. The space-saving use case is enormous, e.g., text entry on PDAs and cellular phones, which lack a standard keyboard because their text-entry keypads have a limited number of keys. Soft-key programmability is also a considerable factor in the design of virtual keyboards. Similarly, virtual keyboards have been designed without mechanical keys; these kinds of keyboards are used in hostile environments. Virtual keyboards also find their place in transport environments, e.g., rail, plane or automotive, and in public kiosks, where designs that avoid mechanical failure are preferred. Industrial and medical environments likewise place special requirements on keyboard designs.

1.2 PROBLEM STATEMENT
The goal is to design a vision based Virtual Keyboard that detects an interrupt as key recognition instead of relying on the mechanical transducer operation of key pressing. Mono-vision video of the hand posture while pressing keys is analyzed, and the analyzed hand posture is processed through various transformations to estimate the key pressed. Mechanical transducers perform two operations for key estimation (key press and key release), whereas the Virtual Keyboard requires only the key press operation to estimate the key, not the key release operation.

CHAPTER 2
SYSTEM ANALYSIS

2.1 EXISTING SYSTEM
Current virtual keyboard designs appear in various forms, such as finger-joint wearable sensor gloves, Thumbcode, accelerometer based inputs, laser projected keyboards and gyroscope based sensing. Each virtual keyboard has certain design characteristics; however, the performance parameters for evaluating keyboards are the same: number of discrete keys, response time and failure rate. Other parameters include the ability to remap, key-symbol mapping and space requirements.

2.1.1 DRAWBACKS
 The cost of the most successful Virtual Keyboard designs is high because of the design and the expensive technology involved.
 The setup cost ranges from Rs. 5000 upwards.

2.2 VIRTUAL KEYBOARDS: METHODS AND DEVICES
The following are the various types and methods of existing Virtual Keyboard systems.

2.2.1 VISUAL PANEL
The Visual Panel [15] consists of a camera and a sheet of paper, as shown in figure 2.1. The location of the extended index finger relative to the paper is found using computer vision. The primary application is a mouse pointer; clicking is achieved by resting the fingertip in its current position for three seconds. The authors demonstrated text entry by interpreting pointer locations as the keys of a keyboard printed on the sheet of paper. An audible notification signals the recognition of a character after the three-second dwell interval.

Figure 2.1. Visual Panel

2.2.2 FINGER-JOINT GESTURE WEARABLE KEYPAD
The Finger-Joint Gesture Wearable Keypad (FJG) [3] suggests viewing the phalanges of the fingers (besides the thumb) of one hand as the keys of a phone keypad, as shown in figure 2.2. The thumb is used to press the virtual buttons. FJG relies on word disambiguation to produce more than 12 characters, yet the drawback of this key-to-symbol mapping might be mitigated by the familiar layout. Also, less complex hand configurations might be less tiring for the user. Just as with Thumbcode, FJG has no user feedback method beyond skin contact sensations.

Figure 2.2. Finger-Joint Gesture Wearable Keyboard

2.2.3 THUMBCODE
The "Thumbcode" method [14], described in figure 2.3, defines the touch of the thumb onto the other fingers' phalanges of the same hand as key strokes. Consequently there are 12 discrete keys (three each for the index, middle, ring and pinky fingers). To produce up to 96 different symbols, the role between keys and modifiers is split: the four fingers can touch each other in eight different ways, each essentially representing a mode, or modifier key, that affects the mapping of the thumb touch. Tactile user feedback is implicit when touching another finger with the thumb. A glove implementation was tested by the author.

Figure 2.3. Thumbcode

2.2.4 CHORDING GLOVE
The Chording Glove [9] employs pressure sensors for each finger of the right hand in a glove to implement a chording input device. Almost all possible finger combinations are mapped to symbols, making some of them potentially hard to type. Additional "mode switches", located along the index finger, are used to produce more than the 25 distinct characters, as in figure 2.4. Yet user experiments suggest otherwise: rates of up to 19 wpm were achieved after ten training sessions "with no signs of leveling off".

Figure 2.4. Chording Glove

2.2.5 FINGERING
FingeRing [4] uses accelerometers on each finger to detect surface impacts. In the wireless version, depicted in figure 2.5, these rings communicate with a wrist-mounted data processing unit. The interaction method is designed for one-handed use but could be extended to two hands, with obvious implications. In the current version, the finger movements needed to produce one character are extensive: two chording patterns have to be typed within a time interval, each consisting of a combination of fingers hitting the surface. Due to this piano-style typing method, users with prior piano experience fare much better with this device; in fact, the full 2-stroke chord mapping proved too difficult for novice users.

Figure 2.5 FingeRing

2.2.6 TOUCHSTREAM
The TouchStream keyboard stretches our definition of a VK, as it has keys printed on its surface. Yet the underlying technology permits software configuration of the sensed areas, just like the multi-point touchpad. Besides conventional touch-typing, the TouchStream affords a number of chording patterns as alternatives to modifier keys. These patterns are pressed by one hand (anywhere on the pad) while the other touches the key that is to be modified. A picture of the TouchStream keyboard is shown in figure 2.6.

Figure 2.6. TouchStream

2.2.7 MULTI-POINT TOUCHPAD
DSI Datotech Systems offers one of the few touchpads that report up to ten surface contacts and their pressure forces independently and simultaneously [1]. While it has not been implemented yet, one could use the 20 x 15 cm device to transfer the traditional keyboard modality one-to-one onto an interactive, software-configurable surface, as in figure 2.7. Inherent to this device are the same user feedback methods as for any of the devices employing tabletop units: finger surface impacts.

Figure 2.7. Multi-Point Touchpad

2.2.8 VKEY
Virtual Devices Inc. recently announced a combined projection and recognition VK [12]. Little is known about this device, but the company's press release suggests that visual sensors (cameras) detect the movement of all ten fingers, as shown in figure 2.8. Just as with the VKB device, the VKey consists of a tabletop unit, and feedback is the tactile sensation of hitting a surface.

Figure 2.8. VKey

2.2.9 VKB PROJECTION
The virtual keyboard technology developed by VKB [13] is a tabletop unit that projects a laser image of a keyboard onto any flat surface. Infrared cameras detect the key strokes of all ten fingers, as shown in figure 2.9. Word disambiguation techniques are employed despite this 1-DOF mapping; our guess is therefore that engagement of all distinct key locations is detected, yet with fairly low accuracy. These two characteristics in combination should result in fairly good recognition rates. Surface impact of the fingers serves as typing feedback.

Figure 2.9. VKB Projection

2.2.10 SCURRY
Tiny gyroscopes on each finger are the sensing technology in Samsung's Scurry [10]. The prototype suggests that these finger rings communicate with a wrist-mounted unit where the data is processed. Finger accelerations and relative positions are detected, making it possible to distinguish multiple key targets per finger. A pictorial representation of Samsung's Scurry is shown in figure 2.10. A surface impact is required to register a keystroke, providing the primary sensory feedback to the user. Small LEDs on the rings potentially provide additional feedback.

Figure 2.10. Scurry

2.2.11 SENSEBOARD
The Senseboard [11] consists of two rubber pads that slip onto the user's hands. Muscle movements in the palm are sensed (with unspecified, non-invasive means) and translated into key strokes with pattern recognition methods. The only feedback, other than characters appearing on a screen, comes from the tactile sensation of hitting the typing surface with the finger, as in figure 2.11.

Figure 2.11 Senseboard

CHAPTER 3
SYSTEM REQUIREMENTS

3.1 HARDWARE REQUIREMENTS
Processor      : Minimum of Intel Pentium
RAM            : 256 MB
Memory Space   : 100 MB (entirely for this application)
Webcam         : Resolution of 320 x 240 pixels

3.2 SOFTWARE REQUIREMENTS
Operating System : Windows Operating System
Media Software   : Java Media Framework

CHAPTER 4
LITERATURE SURVEY

4.1 GESTURE
A gesture may be defined as the physical movement of the hands, arm, face or body with the intent to convey information or a command. Gesture recognition consists of tracking human movement and interpreting that movement as semantically meaningful commands.
There are three types of gesture: mimetic, deictic, and arbitrary. In mimetic gestures, motions are derived from an object's main shape or representative feature; these gestures are intended to be transparent. Deictic gestures are used to point at important objects, and each gesture is transparent within its given context; they can be specific, general or functional. Specific gestures refer to one object, general gestures refer to a class of objects, and functional gestures represent intentions, such as pointing to a chair to ask permission to sit. Arbitrary gestures are those whose interpretation must be learned due to their opacity, and they are uncommon outside a given cultural setting.

4.2 GESTURE ARCHITECTURE
Like any other pattern recognition system, gesture recognition consists of three components:
Gesture acquisition and preprocessing
Gesture feature extraction and representation
Gesture recognition and classification

4.2.1 GESTURE ACQUISITION AND PREPROCESSING
A variety of transducers are available for gesture acquisition, ranging from discrete components such as magnetic sensors, position sensors and accelerometers, through fully developed functional units such as gesture gloves, to more sophisticated systems such as mono vision, stereo vision and range sensors. Technologically, these sensors employ different methodologies, ranging from basic physics principles to sophisticated imaging and lasers.
Vision sensors are installed mainly in two configurations: mono vision and stereo vision. Mono vision sensors incorporate one sensing camera, namely a CCD (charge coupled device) or CMOS (complementary metal oxide semiconductor), with multiple possible interfaces such as USB 2.0, Camera Link, Ethernet, etc., for video signal transmission. Similar acquisition sensors are used for stereo vision; the primary difference lies in the further interpretation of stereo imaging.

4.2.2 GESTURE FEATURE EXTRACTION AND REPRESENTATION
Feature extraction for gesture recognition consists of segmenting the image components that contribute to the formation of gesture inputs. Both raster information (skin tone blobs, colored gloves, etc.) and vector information (joint geometry, facial animation parameters, etc.) form the basis of feature extraction. Some commonly used features for gesture recognition are given below:
Image moments
Skin tone blobs
Colored markers
Geometric features
Multi-scale shape characterization
Motion History Images (MHI)
Motion Energy Images (MEI)
Shape signatures
Polygonal approximation-based shape descriptors
Shape descriptors based upon regions and graphs

4.2.3 GESTURE RECOGNITION AND CLASSIFICATION
Finally, the gesture recognition and classification stage assigns the reported features to a pre-stored category. The following is a list of gesture recognition and classification methods proposed in the literature so far:
Hidden Markov Model (HMM)
Viola and Jones algorithm
Dynamic Programming
Fuzzy Inference Engine
Bayesian Classifier
Template Matching

4.2.3.1 HIDDEN MARKOV MODEL
Dynamic gestures extend over a certain duration of time, so gestures usually appear as sequences of spatiotemporal information. Ultimately, some kind of sequence or spatiotemporal matching process is required for successful gesture recognition. Theoretically, a sequence can be characterized as being generated by some parametric random process, and this characterization can be successfully accomplished with a Hidden Markov Model (HMM).

4.2.3.2 DYNAMIC PROGRAMMING
Dynamic programming is a method for solving problems exhibiting the properties of overlapping sub-problems and optimal substructure that takes much less time than naive methods. The term was originally used in the 1940s by Richard Bellman to describe the process of solving problems where one needs to find the best decisions one after another; by 1953 he had refined it to its modern meaning. The word "programming" in "dynamic programming" has no particular connection to computer programming: a program is, instead, the plan of action that is produced. For instance, a finalized schedule of events at an exhibition is sometimes called a program. Programming, in this sense, means finding an acceptable plan of action.

4.2.3.3 BAYESIAN CLASSIFIER
The Bayesian classifier is one of the basic methods in the supervised classification category. Bayes' rule is stated as follows:

P(C|x) = (P(x|C) p(C)) / p(x)    (1)

where p(C) is the prior probability, which here can be read as the prior probability of a gesture. P(x|C) is the class likelihood: the conditional probability that an event belonging to class C has observation x. For gesture recognition, the likelihood can be specified as the conditional probability that a gesture belonging to class C has the feature vector x. p(x) is the evidence, in the sense that a particular feature vector for some gesture appears with this probability. Finally, the posterior probability P(C|x) is calculated by combining the prior, the likelihood and the evidence. For multiple classes, the posterior probability is calculated per class in the same way.
For minimum error, the Bayesian classifier selects the class with the highest posterior probability, i.e.

Select C_i if P(C_i|x) = max_k P(C_k|x)    (2)
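As an illustration of equations (1) and (2), here is a minimal Java sketch of the minimum-error decision rule; the priors and likelihoods are placeholder values, not data from this project. Since p(x) is the same for every class, comparing P(x|C) * P(C) suffices to find the maximum posterior.

public class BayesDecision {
    // Returns the index of the class with the highest posterior P(C|x).
    // prior[k] = P(C_k); likelihood[k] = P(x|C_k) for the observed feature vector x.
    static int classify(double[] prior, double[] likelihood) {
        int best = 0;
        double bestScore = -1.0;
        for (int k = 0; k < prior.length; k++) {
            // p(x) is constant across classes, so it can be dropped from the comparison.
            double score = likelihood[k] * prior[k];
            if (score > bestScore) {
                bestScore = score;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] prior = {0.5, 0.3, 0.2};       // P(C_k), placeholder values
        double[] likelihood = {0.1, 0.6, 0.3};  // P(x|C_k), placeholder values
        System.out.println("Selected class: " + classify(prior, likelihood));
    }
}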

4.2.3.4 TEMPLATE MATCHING
Template matching is the most general class of methods for recognizing a pattern among pre-stored patterns. It operates at two levels: either raw shapes are recognized directly, or features are first extracted and then matched against pre-stored classes. Pattern matching operates in the spatial as well as the frequency domain.

4.2.3.5 VIOLA-JONES OBJECT DETECTION FRAMEWORK
The Viola-Jones object detection framework, proposed in 2001 by Paul Viola and Michael Jones, was the first object detection framework to provide competitive detection rates in real time. Although it can be trained to detect a variety of object classes, it was motivated primarily by the problem of face detection. The algorithm is implemented in OpenCV as cvHaarDetectObjects().

4.3 REAL TIME MONO VISION GESTURE BASED VIRTUAL KEYBOARD
The paper "Real Time Mono Vision Gesture Based Virtual Keyboard System" presents a novel mono-vision virtual keyboard design for consumers of mobile and portable computing devices such as PDAs and mobile phones. Inherently fuzzy approaches to gesture recognition, with one gesture for each symbol (rather than chording methods), are developed to realize the soft keyboard. The key pressed over the printed-sheet keyboard is estimated by analyzing the hand and finger gestures captured in the video sequence. A real time system is developed by integrating a camera with a PDA in the application environment. Reliable results were obtained with the implementation of the proposed real time mono vision gesture based virtual keyboard system.
In this project a camera is used for video capture, and a novel gesture recognition based virtual keyboard system is designed. A gesture may be defined as the physical movement of the hands, arm, face or body with the intent to convey information or a command. Gesture recognition consists of tracking human movement and interpreting that movement as semantically meaningful commands. It has the potential to be a natural and powerful tool for intuitive interaction between human and computer, and has been successfully applied in virtual reality, human computer interaction, game control, robot interaction, remote control of home and office appliances, sign language, activity recognition, human behavior analysis, training systems, etc. A gesture recognition system is designed in four stages: gesture acquisition, feature extraction, classification, and learning. Gesture acquisition is accomplished by position sensors, motion/rate sensors, and digital imaging. Feature extraction and classification are real time stages that analyze the acquired gesture, while the learning stage is an off-line activity that learns the relationship between a gesture and its information or command. This project presents a novel gesture recognition based virtual keyboard system that replicates the transducer based keyboard system; gesture acquisition is accomplished by a mono vision sensor.
Suppose the output of the keyboard system is defined as
C = {c1, c2, ..., cL}, where c1 = 'A', c2 = 'B', and so on, and L = 63 is the total number of keys on the keyboard.
[Figure 4.1 compares the two pipelines: in a traditional keyboard, a key-stroke hand movement triggers a transducer action, which emits a character; in the gesture based virtual keyboard, the same key-stroke hand movement is followed by gesture analysis, which emits the character.]
Figure 4.1 Comparison between Traditional Keyboard and Virtual Keyboard
In a traditional keyboard, the transducer action is performed in electro-mechanical switch fashion, while the mono vision gesture based virtual keyboard analyzes the hand and finger gestures in the video sequence. The concept of making a key stroke is similar for both; the comparison of actions between the traditional keyboard and the gesture based virtual keyboard is shown in figure 4.1.
Hand video is captured continuously, and the video is then disassembled into frames. A template is pre-stored in a database for each symbol ('A', 'B', etc.); the frames are compared with this pre-stored data, and when a match is found the corresponding symbol is printed.
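A minimal sketch of this frame-to-template comparison follows, assuming the grabbed frame and the stored templates have equal dimensions and using a simple count of differing pixels as the similarity measure (the report does not specify the measure):

import java.awt.image.BufferedImage;

public class TemplateMatcher {
    // Returns the symbol whose pre-stored template differs least from the frame.
    static char matchSymbol(BufferedImage frame, BufferedImage[] templates, char[] symbols) {
        int best = 0;
        long bestDiff = Long.MAX_VALUE;
        for (int i = 0; i < templates.length; i++) {
            long diff = 0;
            for (int y = 0; y < frame.getHeight(); y++)
                for (int x = 0; x < frame.getWidth(); x++)
                    if (frame.getRGB(x, y) != templates[i].getRGB(x, y)) diff++;
            if (diff < bestDiff) { bestDiff = diff; best = i; }
        }
        return symbols[best];  // one of c1..cL, e.g. 'A', 'B', ...
    }
}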

CHAPTER 5
PROPOSED SYSTEM

5.1 PROPOSED WORK
In this project, the problem of the Virtual Keyboard is approached from a new perspective using a simple mono vision camera. Since the project is fully software centric rather than hardware centric, the project cost is drastically low. The proposed model of the system is shown in figure 5.1.
Steps involved in the proposed system:
 Human hand finger movement is analyzed from a video sequence.
 Background objects (other than the finger) are eliminated using the Threshold algorithm.
 The finger tip is analyzed and processed using the Edge Detection method.
 Finally, the key is evaluated from the result of the Edge Detection method.

[Figure 5.1 flow: WebCam → capture key strokes as video → separate the video into frames → identify the odd frame (the one that varies most) in the frame sequence → convert the frame into a binary image by thresholding and extract the finger → find the edge (x, y) of the finger from the binary image → extract the relevant character for the coordinate (x, y).]
Figure 5.1 Structure of the Virtual Keyboard system

5.2 PICTORIAL REPRESENTATION
A camera is mounted over any desired location; the only criterion is that the camera should view the entire keyboard layout. A practical representation of the Virtual Keyboard system setup is shown in figure 5.2.

Figure 5.2. Practical Setup

5.3 MODULES
The proposed Virtual Keyboard system includes the following modules:
1. Image Acquisition
    Video capture
    Frame extraction
2. Interrupt Detection from Frame Sequence
3. Finger Extraction using Threshold Algorithm
4. Finger Tip Analysis by Edge Detection Method
5. Key Extraction
5.3.1 IMAGE ACQUISITION
Image acquisition is the process of capturing the human hand finger movement over the keyboard layout in video format. An interface is created between the computer and the simple mono vision camera. Once the model is executed, the interface connects the application to the webcam and a sequence of frames, i.e., a video stream, is obtained.

5.3.2 INTERRUPT DETECTION FROM FRAME SEQUENCE
The frames of the video stream are compared sequentially; whenever a processed frame varies greatly from the previously processed ones, that frame is grabbed and used for the next stage. (For example, whenever a key stroke occurs the image differs markedly from its background, and such a frame is grabbed.)

5.3.3 FINGER EXTRACTION USING THRESHOLD ALGORITHM
Thresholding is the simplest method of image segmentation; from a grayscale image, thresholding can be used to create binary images. Frames are analyzed continuously. When any interrupt occurs, say a human finger, the algorithm captures the particular frame and background elimination takes place; only the interrupt is shown after the frame is converted into a grayscale image.

5.3.4 FINGER EDGE DETECTION MECHANISM
The human finger tip, or edge, is analyzed using the Edge Detection method. Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. Using vector calculation, if the estimated finger tip is located in a particular vector region then the corresponding key is estimated.
5.3.5 KEY EXTRACTION
When a key is evaluated, it should be made readily available to all text editors and other applications. This is done by overriding the evaluated key at the hardware level. The Robot class in Java does this easily: after importing the Robot package, a Robot object is created, and this object provides two methods, keyPress and keyRelease, each taking a single integer key-code argument.

5.4 ADVANTAGES
 Opens a new door to keyboard based applications.
 Games can make maximum use of the keyboard by displaying only those keys that are used in the game.
 Multilingual support: since the key displays are reconfigurable, there is no longer a language barrier.
 Touch screens are similar to this implementation, but they require additional effort and are not ergonomically comfortable; here the user does not have to raise an arm to the monitor every time.
 Keyboards of any resolution can be built according to the user's choice.

5.5 APPLICATIONS
 Any place where a keyboard is used.
 As an alternative keypad for mobile phones or smart devices that have a front camera; this is possible if the software is ported to J2ME.
 As an input device for gaming.
 At places where a computer or device has multilingual users, such as net cafes.
 Highly comfortable usage in areas like ATMs, hospital bill checking, railway reservation centers, etc.

5.6 TECHNOLOGIES USED IN PROPOSED SYSTEM

5.6.1 THRESHOLDING
Thresholding is the simplest method of image segmentation. From a grayscale image, thresholding can be used to create binary images. During the thresholding process, individual pixels in an image are marked as "object" pixels if their value is greater than some threshold value (assuming the object to be brighter than the background) and as "background" pixels otherwise. This convention is known as threshold above. Variants include threshold below, which is the opposite of threshold above; threshold inside, where a pixel is labeled "object" if its value lies between two thresholds; and threshold outside, which is the opposite of threshold inside. Typically, an object pixel is given a value of "1" while a background pixel is given a value of "0". Finally, a binary image is created by coloring each pixel white or black, depending on the pixel's label.

5.6.1.1 THRESHOLD SELECTION
The key parameter in the thresholding process is the choice of the threshold value (or values, as mentioned earlier). Several different methods for choosing a threshold exist: users can manually choose a threshold value, or a thresholding algorithm can compute a value automatically, which is known as automatic thresholding. A simple method would be to choose the mean or median value, the rationale being that if the object pixels are brighter than the background, they should also be brighter than the average. In a noiseless image with uniform background and object values, the mean or median will work well as the threshold; however, this will generally not be the case. A more sophisticated approach is to create a histogram of the image pixel intensities and use the valley point as the threshold. The histogram approach assumes that there is some average value for the background and object pixels, but that the actual pixel values have some variation around these averages. However, this may be computationally expensive, and image histograms may not have clearly defined valley points, often making the selection of an accurate threshold difficult. One method that is relatively simple, does not require much specific knowledge of the image, and is robust against image noise is the following iterative method:
1. An initial threshold (T) is chosen; this can be done randomly or by any other method desired.
2. The image is segmented into object and background pixels as described above, creating two sets (note: f(m,n) is the value of the pixel in the mth column, nth row):
   1. G1 = {f(m,n) : f(m,n) > T} (object pixels)
   2. G2 = {f(m,n) : f(m,n) ≤ T} (background pixels)
3. The average of each set is computed:
   1. m1 = average value of G1
   2. m2 = average value of G2
4. A new threshold is created as the average of m1 and m2:
   1. T' = (m1 + m2)/2
5. Go back to step 2, now using the new threshold computed in step 4, and keep repeating until the new threshold matches the previous one (i.e., until convergence is reached).
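The five steps above translate directly into code. A Java sketch for an 8-bit grayscale image follows; the initial threshold of 128 is an arbitrary choice, as step 1 permits:

import java.awt.image.BufferedImage;

public class ThresholdSelection {
    static int iterativeThreshold(BufferedImage gray) {
        int t = 128;  // step 1: initial threshold, chosen arbitrarily
        while (true) {
            long sum1 = 0, n1 = 0, sum2 = 0, n2 = 0;
            for (int y = 0; y < gray.getHeight(); y++) {
                for (int x = 0; x < gray.getWidth(); x++) {
                    int v = gray.getRaster().getSample(x, y, 0);
                    if (v > t) { sum1 += v; n1++; }  // step 2: G1, object pixels
                    else       { sum2 += v; n2++; }  //         G2, background pixels
                }
            }
            int m1 = n1 == 0 ? 0 : (int) (sum1 / n1);  // step 3: average of each set
            int m2 = n2 == 0 ? 0 : (int) (sum2 / n2);
            int tNew = (m1 + m2) / 2;                  // step 4: new threshold
            if (tNew == t) return t;                   // step 5: stop at convergence
            t = tNew;
        }
    }
}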

5.6.1.2 ADAPTIVE THRESHOLDING
Thresholding is called adaptive thresholding when a different threshold is used for different regions of the image. This is also known as local or dynamic thresholding.

5.6.2 EDGE DETECTION MECHANISM
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities.
The purpose of detecting sharp changes in image brightness is to capture important events and changes in the properties of the world. It can be shown that under rather general assumptions for an image formation model, discontinuities in image brightness are likely to correspond to:
 discontinuities in depth,
 discontinuities in surface orientation,
 changes in material properties, and
 variations in scene illumination.
In the ideal case, applying an edge detector to an image yields a set of connected curves that indicate the boundaries of objects and of surface markings, as well as curves that correspond to discontinuities in surface orientation. Applying an edge detection algorithm to an image may thus significantly reduce the amount of data to be processed and filter out information regarded as less relevant, while preserving the important structural properties of the image. If the edge detection step is successful, the subsequent task of interpreting the information content of the original image may be substantially simplified. However, it is not always possible to obtain such ideal edges from real-life images of moderate complexity. Edges extracted from non-trivial images are often hampered by fragmentation (meaning the edge curves are not connected), missing edge segments, and false edges that do not correspond to interesting phenomena in the image, thus complicating the subsequent task of interpreting the image data.

CHAPTER 6
PROPOSED SYSTEM DESIGN

6.1 UML DIAGRAMS
A set of fundamental design concepts has evolved over the past three decades. Each one provides the software designer with a foundation from which more sophisticated design methods can be applied. The choice of which models and diagrams to create has a great influence on how a problem is approached and how a corresponding solution is shaped. Design begins with a model of the system and continues by converting this model into an implementation of the new system. Every complex system is best approached through a small set of nearly independent views of the model, and each model can be expressed at different levels. The best models are connected to reality.
In the design process the following UML diagrams are used:
 Use Case Model
 Class Diagram
 Activity Diagram
 Sequence Diagram

6.1.1 USE CASE DIAGRAM
Use cases are scenarios for understanding the system requirements. A use case model can be instrumental in project development, planning and documentation of system requirements. A use case is an interaction between users and a system; it captures the goal of the users and the responsibility of the system to its users.
The use case model describes the uses of the system and shows the courses of events that can be performed. In other words, it shows a system in terms of its users and how it is being used from the users' point of view. Furthermore, it defines what happens in the system when a use case is performed. In essence, the use case model tries to systematically identify the uses of the system and therefore the system's responsibilities.
A use case model can also reveal classes and relationships among subsystems of the system. It can be developed by talking to typical users and discussing the various things they might want to do with the application being prepared; each use case or scenario represents what the user wants to do.
The use case model expresses what the business or application will do, not how; hence it is called the "what" model. A use case is a sequence of transactions in a system whose task is to yield results of measurable value to an individual actor of the system. Since the use case model provides an external view of the system or application, it is directed primarily toward the users, or "actors", of the system, not its implementers. An actor is a user playing a role with respect to the system. The use case diagram of the proposed Virtual Keyboard system is shown in figure 6.1.

[Figure 6.1 shows the USER and SYSTEM actors connected to the use cases: Video Capture, Finger Occurrence, Background Elimination, Finger Estimation, Finger Tip Analysis and Key Extraction.]
Figure 6.1 Use Case Diagram

6.1.2 SEQUENCE DIAGRAM
A sequence diagram is made up of objects and messages. Objects are represented exactly as in all other UML diagrams: as rectangles with the underlined class name inside. Sequence diagrams map the scenarios described by use cases in step-by-step detail to define how objects collaborate to achieve the application's goals. A lifeline in a sequence diagram represents an object and shows all its points of interaction with other objects in the events that matter. Sequence diagrams are an easy and intuitive way of describing the behavior of a system by viewing the interaction between the system and its environment. A sequence diagram has two dimensions: the vertical dimension represents time, and the horizontal dimension represents the different objects. The vertical line is called the object's lifeline; it represents the object's existence during the interaction. An object is shown as a box at the top of a dashed vertical line. Each message is represented by an arrow between the lifelines of two objects, and the order in which these messages occur is shown top to bottom on the page. Thus a sequence diagram is very simple and has immediate visual appeal.
Lifelines start at the top of the sequence diagram and descend vertically to indicate the passage of time. Interactions between objects, messages and replies, are drawn as horizontal arrows connecting lifelines. In addition, boxes known as combined fragments are drawn around sets of arrows to mark alternative actions, loops and other control structures.
The sequence diagram for the Image Acquisition module is shown in figure 6.2.
[Figure 6.2 lifelines: USER, CAMERA, SYSTEM, FRAME. Messages: the user requests the process, the camera is selected, the video is captured, the video stream is returned, and the video is divided into frames.]
Figure 6.2 Sequence Diagram for Image Acquisition
The sequence diagram for the Interrupt Detection in Frame Sequence module is shown in figure 6.3.
[Figure 6.3 lifelines: SYSTEM, FRAME, INTERRUPT DETECTION, FRAME SEPARATION. Messages: the video is divided into frames, the frames are searched for an interrupt frame, the interrupt frame is processed, and the separated frame is returned.]
Figure 6.3 Sequence Diagram for Interrupt Detection in Frame Sequence
The sequence diagram for the Finger Extraction using Threshold Algorithm module is shown in figure 6.4.
[Figure 6.4 lifelines: SYSTEM, THRESHOLD. Messages: the separated frame is passed in and converted into a binary image.]
Figure 6.4 Sequence Diagram for Finger Extraction using Threshold Algorithm
The sequence diagram for the Finger Tip Analysis using Edge Detection module is shown in figure 6.5.
[Figure 6.5 lifelines: SYSTEM, EDGE DETECTION, VECTOR VALUE. Messages: the binary image is passed in, the edge is detected, and the vector value is calculated.]
Figure 6.5 Sequence Diagram for Finger Tip Analysis using Edge Detection
The sequence diagram for the Key Extraction module is shown in figure 6.6.
[Figure 6.6 lifelines: SYSTEM, KEY EXTRACTION, ROBOT CLASS, SCREEN. Messages: the vector value is passed in, the key is analyzed, the key code is estimated, and the character is printed to the screen.]
Figure 6.6 Sequence Diagram for Key Extraction

6.1.3 CLASS DIAGRAM
The class diagram contains four classes: Video to Frames, Threshold, Edge Detection, and Keyboard Interrupt Handler. The attributes of Video to Frames are frame extracted, buffered image and frame saving; its operations are save() and frameExtract(). The attributes of Threshold are get RGB value, average RGB and binary image; its operations are getRGB() and avgRGB(). The attributes of Edge Detection are get finger tip vector and vector matching; its operation is getFirstBlack(). The attributes of Keyboard Interrupt Handler are vector key estimation and hardware overriding; its operation is printChar(). The class diagram of the proposed system is shown in figure 6.7.

[Figure 6.7 shows the four class boxes, each listing the attributes and operations named above.]
Figure 6.7 Class Diagram of Virtual Keyboard
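For illustration only, the diagram could map onto Java skeletons like these; the names follow the diagram, while the bodies are placeholders rather than the project's actual code:

import java.awt.Point;
import java.awt.image.BufferedImage;

class VideoToFrames {
    BufferedImage bufferedImage;                      // current extracted frame
    void save() { /* persist the extracted frame */ }
    BufferedImage frameExtract() { return bufferedImage; }
}

class Threshold {
    int getRGB(BufferedImage img, int x, int y) { return img.getRGB(x, y); }
    int avgRGB(BufferedImage img) { /* average intensity over the image */ return 0; }
}

class EdgeDetection {
    Point getFirstBlack(BufferedImage binary) { /* fingertip scan */ return null; }
}

class KeyboardInterruptHandler {
    void printChar(int keyCode) { /* hardware overriding via java.awt.Robot */ }
}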

6.1.4 ACTIVITY DIAGRAM
An activity diagram is a variation, or special case, of a state machine in which the states are activities representing the performance of operations and the transitions are triggered by the completion of those operations. Unlike a state diagram, which focuses on the events occurring on a single object as it responds to messages, an activity diagram can be used to model an entire business process. The purpose of an activity diagram is to provide a view of the flows and of what is going on inside a use case or among several classes; however, an activity diagram can also be used to represent a class's method implementation.
An activity diagram is used mostly to show the internal state of an object, but external events may appear in it. An external event appears when the object is in a "wait state": a state during which there is no internal activity by the object, and the object is waiting for some external event to occur as the result of an activity by another object. The two kinds of state are the wait state and the activity state. The activity diagram of the proposed system is shown in figure 6.8.

[Figure 6.8 flow: Virtual Keyboard → Video Capture → Divide into Frames → Finger Occurrence → Background Elimination → Finger Estimation → Finger Tip Analysis → Key Extracted.]
Figure 6.8 Activity Diagram of Virtual Keyboard

CHAPTER 7
PROJECT DESCRIPTION

7.1 IMAGE ACQUISITION
Image acquisition is the first step of this project, and the acquisition process is done using a simple mono vision camera. A simple mono vision camera is not a special device; it is just an ordinary camera, such as an external or integrated webcam. The camera should provide at least 320 x 240 pixels per image and a sensor of 1.3 megapixels or above; the more pixels, the greater the accuracy and the lower the error rate.
The camera's default device driver is required for hardware overriding. When multiple cameras are connected, a window is opened to select the required camera; "vfw://0" is the locator keyword used to detect the connected cameras.
Acquisition is in the form of video, and video is a sequence of multiple frames. When the video is split into frames, each frame serves as an image; 10 to 15 frames are created every second. During image acquisition, all the obtained images can be stored in any desired location if necessary. The block diagram of Image Acquisition is shown in figure 7.1 and snapshots in figure 7.2.

[Figure 7.1 flow: Surface → Camera → Video Capture → Converting Video to Frames.]
Figure 7.1 Block Diagram for Image Acquisition

[Figure 7.2: frames combine together to form a video; the video is split back into frames.]
Figure 7.2 Snapshot of Image Acquisition
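Given that the report names the Java Media Framework and the "vfw://0" locator, a minimal frame-grabbing sketch might look as follows; it assumes JMF is installed and a vfw capture device is present, and the two-second warm-up delay is an arbitrary choice:

import javax.media.Buffer;
import javax.media.Manager;
import javax.media.MediaLocator;
import javax.media.Player;
import javax.media.control.FrameGrabbingControl;
import javax.media.format.VideoFormat;
import javax.media.util.BufferToImage;
import java.awt.Image;

public class FrameGrabber {
    public static void main(String[] args) throws Exception {
        // Open the first Video-for-Windows capture device via its locator.
        Player player = Manager.createRealizedPlayer(new MediaLocator("vfw://0"));
        player.start();
        Thread.sleep(2000);  // give the device time to start streaming

        // Grab one frame from the running video stream.
        FrameGrabbingControl fgc = (FrameGrabbingControl)
                player.getControl("javax.media.control.FrameGrabbingControl");
        Buffer buf = fgc.grabFrame();

        // Convert the raw buffer into an AWT image for later processing.
        BufferToImage converter = new BufferToImage((VideoFormat) buf.getFormat());
        Image frame = converter.createImage(buf);
        System.out.println("Grabbed a frame: " + frame);

        player.close();
    }
}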

7.2 INTERRUPT DETECTION IN FRAME SEQUENCE
At regular intervals (about 10 to 15 times every second), the current frame from the video is compared with the previous images, and the image with the maximum variation is grabbed. The block diagram of Interrupt Detection in Frame Sequence is shown in figure 7.3 and snapshots in figure 7.4.
[Figure 7.3 flow: Frames → check for maximum variation over 5 frames → if yes, grab the frame and proceed to the next level; if no, continue with the next frames.]
Figure 7.3 Block Diagram for Interrupt Detection in Frame Sequence

[Figure 7.4: the interrupt-detected frame highlighted within the sequence of frames.]
Figure 7.4 Snapshot of Interrupt Detection
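One simple way to realize this comparison is pixel-wise frame differencing: count how many pixels changed between consecutive frames and flag the frame when the count is large. The per-pixel tolerance and the five-percent cutoff below are illustrative assumptions, not values from the report:

import java.awt.image.BufferedImage;

public class InterruptDetector {
    // A frame is treated as an "interrupt" frame when enough of its pixels
    // differ noticeably from the previous frame.
    static boolean isInterrupt(BufferedImage prev, BufferedImage curr) {
        long changed = 0;
        for (int y = 0; y < curr.getHeight(); y++) {
            for (int x = 0; x < curr.getWidth(); x++) {
                int a = prev.getRGB(x, y), b = curr.getRGB(x, y);
                int dr = Math.abs(((a >> 16) & 0xFF) - ((b >> 16) & 0xFF));
                int dg = Math.abs(((a >> 8) & 0xFF) - ((b >> 8) & 0xFF));
                int db = Math.abs((a & 0xFF) - (b & 0xFF));
                if (dr + dg + db > 32) changed++;  // pixel differs noticeably (assumed tolerance)
            }
        }
        long total = (long) curr.getWidth() * curr.getHeight();
        return changed > total / 20;  // more than 5% of pixels changed (assumed cutoff)
    }
}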

7.3 FINGER EXTRACTION USING THRESHOLD ALGORITHM
The image from the preprocessing stage uses the Red, Green and Blue (RGB) color model. The RGB color model is removed from each image so that only the black and white color model remains; these two colors are called binary colors, and finally a binary image is created.
Converting a colored image into a binary image helps in calculating and analyzing the foreground objects; ultimately, background object detail is quite often reduced to a great extent.
Combining the channel tests red > 16, green > 8 and blue > 0 produces the binary image. The block diagram of Finger Extraction using the Threshold Algorithm is shown in figure 7.5 and snapshots in figure 7.6.

7.3.1 PSEUDOCODE FOR THRESHOLDING
for each image pixel:
    if red > 16 and green > 8 and blue > 0:
        mark the pixel BLACK (object)
    else:
        mark the pixel WHITE (background)

Figure 7.5 Block Diagram for Finger Extraction using Threshold Algorithm
[Figure 7.6: an RGB image alongside the resulting binary image.]
Figure 7.6 Snapshot of Finger Extraction using Threshold Algorithm
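The pseudocode above maps directly onto Java. The sketch below unpacks the packed RGB integer with bit shifts and applies the report's channel cutoffs (red > 16, green > 8, blue > 0), which are project-specific values:

import java.awt.image.BufferedImage;

public class FingerExtractor {
    static BufferedImage toBinary(BufferedImage rgb) {
        BufferedImage bin = new BufferedImage(rgb.getWidth(), rgb.getHeight(),
                BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < rgb.getHeight(); y++) {
            for (int x = 0; x < rgb.getWidth(); x++) {
                int p = rgb.getRGB(x, y);
                int r = (p >> 16) & 0xFF;  // unpack the channels by shifting
                int g = (p >> 8) & 0xFF;
                int b = p & 0xFF;
                boolean object = r > 16 && g > 8 && b > 0;      // flowchart test
                bin.setRGB(x, y, object ? 0x000000 : 0xFFFFFF); // object = black
            }
        }
        return bin;
    }
}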

7.4 FINGER TIP ANALYSIS BY EDGE DETECTION METHOD
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. The purpose of detecting sharp changes in image brightness is to capture important events and changes in the properties of the world.
The output of the thresholding process is a binary image. The obtained image contains more detail about the object and less about the background. Only black pixels are taken into account: the first black pixel layer is analyzed from the binary image, the vector value of that pixel is calculated, the corresponding value for the vector is pre-stored, and then the key is evaluated. Figure 7.7 illustrates the method of finger tip analysis by edge detection.
7.4.1 PSEUDOCODE FOR EDGE DETECTION METHOD
scan the binary image for the first black pixel layer
take that pixel as the finger edge and compute its vector value (x, y)

[Figure 7.7: the binary image, with a red line depicting the edge of the finger.]
Figure 7.7 Snapshot of Finger Tip Analysis by Edge Detection Method
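A sketch of the fingertip search as described above: scan the binary image and return the first black pixel as the fingertip coordinate. The top-down, left-to-right scanning order is an assumption; the report does not state the direction:

import java.awt.Point;
import java.awt.image.BufferedImage;

public class FingerTip {
    static Point firstBlackPixel(BufferedImage bin) {
        for (int y = 0; y < bin.getHeight(); y++) {
            for (int x = 0; x < bin.getWidth(); x++) {
                if ((bin.getRGB(x, y) & 0xFFFFFF) == 0) {
                    return new Point(x, y);  // fingertip coordinate (x, y)
                }
            }
        }
        return null;  // no finger present in this frame
    }
}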

7.5 KEY EXTRACTION
Once the Virtual Keyboard model is implemented, the evaluated key should behave exactly like a key from a traditional keyboard, i.e., when a key is evaluated it should be made readily available to all text editors and other applications. This is done by overriding the evaluated key at the hardware level. The Robot class in Java does this easily: after importing the Robot package, a Robot object is created, and this object provides two methods, keyPress and keyRelease, each taking a single integer key-code argument. In this Virtual Keyboard, the keyPress function alone is sufficient for hardware overriding. Figure 7.8 illustrates the method of key extraction.

[Figure 7.8: the respective x and y coordinate values are calculated and mapped to a key.]
Figure 7.8 Snapshot of Key Extraction
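Hardware overriding with java.awt.Robot can be sketched as below. The key code VK_A stands in for whatever the vector-to-key mapping yields; and although the report notes that keyPress alone suffices here, the sketch also calls keyRelease so the injected key does not auto-repeat:

import java.awt.AWTException;
import java.awt.Robot;
import java.awt.event.KeyEvent;

public class KeyInjector {
    public static void main(String[] args) throws AWTException {
        Robot robot = new Robot();
        robot.keyPress(KeyEvent.VK_A);    // inject the evaluated key
        robot.keyRelease(KeyEvent.VK_A);  // release so the key does not repeat
    }
}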


CHAPTER 8
SYSTEM TESTING

8.1 SYSTEM TESTING
System testing is the stage of implementation that aims at testing and ensuring that the system works accurately and efficiently before live operation commences. The logical and physical designs should be thoroughly and continually examined on paper to ensure that they will work when implemented; implementation should thus confirm that the whole system works. The testing phase includes entering sample data to verify whether the system works suitably against the stated requirements. This phase is important in that it actually deals with real data.
Software testing is an important element of software quality assurance and the ultimate review of specification, design and coding. In testing, the engineer creates a series of test cases that are intended to "demolish" the software that has been built.

8.2 OBJECTIVE OF TESTING
The rules that serve as testing objectives are:
 Testing is a process of executing a program with the intent of finding an error.
 A good test is one that has a high probability of finding an undiscovered error.
 A successful test is one that uncovers an as-yet-undiscovered error.
If testing is conducted successfully according to the above objectives, it will uncover the errors in the software.
The various types of testing are:
 Unit testing
 Integration testing
 Validation testing

8.3 UNIT TESTING
Unit testing is a procedure used to validate individual units of code. A unit is the smallest testable part of an application. The goal of unit testing is to isolate each part of the program and show that the individual parts are correct; a unit test provides a strict, written contract.

8.4 INTEGRATION TESTING
Integration testing (sometimes called Integration and Testing, abbreviated I&T) is the phase of software testing in which individual software modules are combined and tested as a group. Integration testing takes as its input modules that have been unit tested, groups them into larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system ready for system testing.

8.5 VALIDATION TESTING
Validation testing provides the final assurance that the software meets all functional, behavioral and performance requirements. The software, once validated, must be combined with other system elements. After each validation test case has been conducted, one of two possible conditions exists:
 The function or performance characteristic conforms to specification and is accepted.
 A deviation from specification is uncovered and a deficiency list is created.
A deviation or error discovered at this stage in a project can rarely be corrected prior to scheduled completion, so it is necessary to negotiate with the customer to establish a method for resolving the deficiencies.

8.6 INTRODUCTION TO WINRUNNER
WinRunner facilitates easy test creation by recording how you work on applications as you point and click GUI objects. It generates a test script in the C-like Test Script Language (TSL), which you can further enhance with manual programming. WinRunner includes a function generator that helps you quickly and easily add functions to your recorded test.

CHAPTER 9
CONCLUSION AND FUTURE ENHANCEMENT

9.1 CONCLUSION
The results showed a very reliable and practical system. The proposed system is low cost due to its software centric mechanism rather than a hardware centric one. The performance of the system was tested on a personal computer. The data set involved in the development of the system can easily be altered on user request, and a standard data set style is implemented. The response time for Key Extraction is quite low compared to systems like the Finger-Joint Gesture Wearable Keypad, Thumbcode, etc.

9.2 FUTURE ENHANCEMENT
The failure rate of the system depends entirely on the light intensity. The system also works in dim light, but the failure rate is then high due to shadowing effects. Future work aims at making the system efficient regardless of light intensity.

10.1 SCREEN SHOTS

Figure 10.1 Screen Shot for Choosing the Environment
[The remaining screenshots show input frames and the corresponding text output of the running system.]
