
Publication date: 1 June 2021

Clip-art Retrieval using Sketches

Large collections of clip-arts, in both vector and raster formats, are now available on the Internet and in collections sold on optical media. Such drawings are typically archived and accessed by category (e.g. food, shapes). Although people find it useful to include these figures in their documents (presentations, theses, etc.), finding a particular drawing among hundreds of thousands is not an easy task. Searching for them manually is usually slow and problematic, and solutions based on keywords or tags are also impractical, since the annotations have to be created manually and users are forced to know in detail the meta-information used to characterize the drawings. Moreover, textual descriptions are not adequate for describing layout, shape and topology [1], and they suffer from low term agreement across indexers [2] and between indexers and user queries [3,4]. A more adequate solution must rely on information extracted automatically from clip-arts rather than on information generated manually by people.
Although there are several solutions for Content-Based Image Retrieval, they focus on photographic images rather than on raster drawings, which have well-defined contours and lines. Conversely, most solutions for Drawing Analysis and Retrieval cannot deal with complex drawings such as clip-arts and support only the retrieval of simple drawings (see Figure 1 in Figures.pdf). Technologies currently exist to retrieve both types of clip-arts, raster and vector, but they are evolving separately, tackling different problems without taking advantage of a combined approach that merges the strengths of the two research areas to handle the challenge of clip-art retrieval and browsing.
Here, we want to develop a new approach to retrieving clip-arts, independently of their format, that combines the strengths of both techniques. Our solution will allow users to search for and retrieve clip-arts using sketches as queries. It will use techniques from image processing and from vector drawing analysis to describe clip-art contents. From the vector side, we will explore the spatial arrangement of the visual components that constitute the drawing, as well as techniques used to describe their shape. From the image processing side, we will take advantage of existing methods to visually simplify clip-arts and to extract color and texture features that describe their content. This simplification task is fundamental, since we want to compare simple queries (specified using sketches) with clip-art drawings, which are more complex (see Figure 2 in Figures.pdf). To make the comparison more effective, we need to simplify clip-arts while keeping them visually recognizable to users. There is a trade-off between simplification and preservation of meaning, which we need to evaluate through human perception tests.
Another important aspect of the proposed solution is that we intend to support large databases of clip-arts (on the order of hundreds of thousands), whereas the majority of existing solutions handle only hundreds of drawings. To that end, we will invest in the development of an indexing structure that will accommodate the multidimensional feature vectors extracted from clip-arts and will accelerate the searching and matching steps.
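As a rough illustration of why such an index helps, the sketch below stores feature vectors in a k-d tree and answers a nearest-neighbour query without comparing the query against every stored vector. The k-d tree is a well-known stand-in chosen only for illustration; it is not the indexing structure the project proposes to develop. The names `Node`, `build_kdtree` and `nearest`, the two-dimensional vectors and the labels are all hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


class Node:
    """One k-d tree node holding a feature vector and the drawing it describes."""

    def __init__(self, point: Vector, label: str, axis: int,
                 left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None) -> None:
        self.point = point
        self.label = label
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(items: List[Tuple[Vector, str]], depth: int = 0) -> Optional[Node]:
    """Recursively split the (vector, label) pairs on alternating axes."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda item: item[0][axis])
    mid = len(items) // 2
    point, label = items[mid]
    return Node(point, label, axis,
                build_kdtree(items[:mid], depth + 1),
                build_kdtree(items[mid + 1:], depth + 1))


def nearest(node: Optional[Node], query: Vector,
            best: Optional[Tuple[float, str]] = None) -> Optional[Tuple[float, str]]:
    """Return (distance, label) of the stored vector closest to the query."""
    if node is None:
        return best
    dist = math.dist(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


# Hypothetical 2-D feature vectors; real descriptors would have many more dimensions.
database = [((0.1, 0.2), "apple"), ((0.9, 0.8), "car"),
            ((0.5, 0.5), "house"), ((0.2, 0.9), "tree")]
tree = build_kdtree(database)
distance, label = nearest(tree, (0.85, 0.75))
```

Pruning of this kind becomes less effective as the number of dimensions grows, which is one reason a dedicated multidimensional indexing structure may be worth developing for high-dimensional descriptors.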
To achieve these goals, we will start by analyzing the composition of clip-arts to identify components and areas to which we can apply simplifications. After defining and validating several simplification heuristics, we will extract content information using both a vector and a raster representation. To deal with the resulting feature vectors, we will develop a new multidimensional indexing structure to speed up searching. After developing these foundation components, we will combine them in a final prototype for retrieving clip-arts. Additionally, we will include multimodal techniques for specifying queries, which may combine sketches and speech, among other modalities, and we will develop new, efficient mechanisms for browsing and exploring results or the entire collection of clip-arts. Finally, we plan to evaluate our solution by measuring precision and recall, and by performing tests with users.
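For reference, precision is the fraction of retrieved drawings that are relevant to the query, and recall is the fraction of relevant drawings that were retrieved. The following is a minimal sketch of that computation for a single query; the function name `precision_recall` and the drawing identifiers are hypothetical and only serve the example.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str],
                     relevant: Iterable[str]) -> Tuple[float, float]:
    """Compute precision and recall for one query over sets of drawing ids."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list for one sketch query and its ground-truth relevant set.
retrieved_ids = ["clip_a", "clip_b", "clip_c", "clip_d"]
relevant_ids = ["clip_a", "clip_c", "clip_e"]
precision, recall = precision_recall(retrieved_ids, relevant_ids)
```

Here two of the four retrieved drawings are relevant, and two of the three relevant drawings were found, so precision is 2/4 and recall is 2/3.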
Results achieved in the recent past in these areas of research (drawing and image retrieval, multidimensional indexing, calligraphic interfaces, virtual environments) lead us to believe that we have the right competences and that we are proposing the right path to achieve the goals defined for this project. The combination of skills from both partners is an important added value in ensuring the successful completion of the project.

In summary, we plan to develop a solution for the retrieval of clip-arts from large collections using sketches as queries. We expect to have a novel approach, able to deal with raster and vector drawings in a unified way, taking advantage of the best techniques from both research areas.

Team

Nuno Correia, Rui Jesus

Short name: CRUSH
Total funding: 67000
Center funding: 27300
State: Concluded
Start date: 01/03/2010
End date: 28/02/2013