

Title:
APPARATUS AND METHOD FOR ITERATIVE OPERATION ON REGULAR GRAPH
Document Type and Number:
WIPO Patent Application WO/2009/057967
Kind Code:
A3
Abstract:
In an iterative operation apparatus and method for a regular graph, a current group is defined in a specific direction in a 3-dimensional graph in which a hidden variable cost and a feature cost are stacked for an iterative operation on input data, and an operation initialization is performed by setting the number of iterations. Then, the iterative operation is sequentially performed on the hidden variable cost and the feature cost for each node in the current group up to the set number of iterations. Thereafter, a hidden state of the current group is estimated when the iterative operation for each node in the current group is completed. Then, the operation initialization, the iterative operation, and estimation of the hidden state are performed iteratively up to a position of a final current group.

Inventors:
PARK SUNG CHAN (KR)
JEONG HONG (KR)
NA INTAE (KR)
Application Number:
PCT/KR2008/006427
Publication Date:
July 16, 2009
Filing Date:
October 30, 2008
Assignee:
POSTECH ACADEMY INDUSTRY FOUND (KR)
PARK SUNG CHAN (KR)
JEONG HONG (KR)
NA INTAE (KR)
International Classes:
G06F5/00
Foreign References:
US6088035A2000-07-11
US5742814A1998-04-21
US20050138045A12005-06-23
US20060080645A12006-04-13
Attorney, Agent or Firm:
JANG, Seong Ku (Trust Tower 275-7,Yangjae-dong, Seocho-gu, Seoul 137-130, KR)

Claims

[1] An iterative operation apparatus for a regular graph, comprising: a feature cost calculation unit configured to calculate a feature cost for input data at each node in the regular graph; an iterative operation unit configured to carry out an iterative operation based on a hidden variable cost and the feature cost for the input data at each node in the regular graph, thereby producing an iterative operation result; and a result calculation unit configured to estimate a hidden state for the input data according to the iterative operation result, wherein a current group is defined by sectoring nodes in a 3-dimensional graph along a specific direction, the nodes corresponding to the input data, and the 3-dimensional graph being formed by stacking the hidden variable cost and the feature cost for which the iterative operation has been performed, and wherein the iterative operation unit sequentially performs the iterative operation with respect to the hidden variable cost and the feature cost for each node in the current group.

[2] The apparatus of claim 1, wherein the iterative operation unit calculates the hidden variable cost and the feature cost at the current position by using a previous set of the hidden variable cost and the feature cost for each node in the current group.

[3] The apparatus of claim 2, wherein the iterative operation unit calculates the hidden variable cost at the current position by using the previous set of the hidden variable cost and the feature cost stored in a layer buffer and a local buffer.

[4] The apparatus of claim 3, wherein the iterative operation unit updates the previous feature cost at a position in a previous operation step stored in the layer buffer to the feature cost at the current position.

[5] The apparatus of claim 3 or 4, wherein the feature cost calculation unit initializes the operation for each bottom-level node contained in the current group.

[6] The apparatus of claim 5, wherein the operation is initialized by calculating the hidden variable cost and the feature cost for said each bottom-level node based on the input data or using the previous set of the hidden variable cost and the feature cost stored in the previous operation step.

[7] The apparatus of claim 5, wherein the iterative operation unit calculates the hidden variable cost at the current position by using the previous set of the hidden variable cost and the feature cost stored in the layer buffer and the local buffer following a sequence of a depth-first operation.

[8] The apparatus of claim 7, wherein the sequence of the depth-first operation is set such that a specified direction takes priority at a branching point in a tree structure formed by operation nodes exponentially increasing with an exponent of two along an axis in the 3-dimensional graph.

[9] The apparatus of claim 7, wherein the iterative operation unit calculates the hidden variable cost and the feature cost such that updating directions of the layer buffer and the local buffer are shifted in the proceeding direction of the operation.

[10] The apparatus of claim 9, wherein the result calculation unit estimates the hidden state according to a result calculated by using a result calculation function after the iterative operation for the hidden variable cost and the feature cost for the current group is completed.

[11] An iterative operation method for a regular graph, comprising: defining a current group in a specific direction in a 3-dimensional graph in which a hidden variable cost and a feature cost are stacked for an iterative operation on input data, and performing an operation initialization by setting the number of iterations; sequentially performing the iterative operation on the hidden variable cost and the feature cost for each node in the current group up to the set number of iterations; estimating a hidden state of the current group when the iterative operation for each node in the current group is completed; and iteratively performing the operation initialization, the iterative operation, and estimation of the hidden state up to a position of a final current group.

[12] The method of claim 11, wherein the hidden variable cost and the feature cost are estimated at a current position by using a previous hidden variable cost and a previous feature cost for each node in the current group.

[13] The method of claim 12, wherein the hidden variable cost is calculated at the current position by using the previous hidden variable cost and the previous feature cost stored in a layer buffer and a local buffer.

[14] The method of claim 13, wherein the previous feature cost at a position in a previous operation step stored in the layer buffer is updated to the feature cost at the current position.

[15] The method of claim 13 or 14, wherein the operation initialization is performed by calculating the hidden variable cost and the feature cost for bottom-level nodes based on the input data according to the number of iterations for the current group, or by using the previous hidden variable cost and the previous feature cost stored in the previous operation step.

[16] The method of claim 15, wherein the iterative operations are carried out following the sequence of the depth-first operation in which a specified direction takes priority at a branching point in a tree structure formed by operation nodes exponentially increasing with an exponent of two along an axis in a 3-dimensional graph in which the current group is stacked.

[17] The method of claim 16, wherein the hidden variable cost and the feature cost are calculated in a manner that updating directions of the layer buffer and the local buffer are shifted in the proceeding direction of the operation.

[18] The method of claim 17, wherein the hidden state is estimated according to a result calculated by using a result calculation function after the iterative operation for the hidden variable cost and the feature cost for the current group is completed.

[19] The method of claim 18, further comprising: outputting a processed result of the hidden state estimated in correspondence to the current group after carrying out the iterative operation up to the final current group.


Description

APPARATUS AND METHOD FOR ITERATIVE OPERATION ON REGULAR GRAPH

Technical Field

[1] The present invention relates to an apparatus and a method for iterative operation on a regular graph; and, more particularly, to an apparatus and a method for iterative operation on a regular graph capable of performing iterative operation for processing a digital image using image data input to an image processing system.

[2]

Background Art

[3] Iterative operation schemes applied to a regular graph, which enable estimation of hidden states of nodes by allowing adjacent nodes to repeatedly transmit and receive information, are widely used in various fields such as communication systems and image processing systems.

[4] In particular, a typical iterative operation scheme applied to a regular graph includes the steps of: respectively allocating hidden states to be estimated to nodes; modeling the relationships between the nodes and their adjacent nodes; performing iterative operation processes using belief propagation (BP), accelerated belief propagation (ABP), hierarchical belief propagation (HBP) and stochastic diffusion (SD); and estimating the hidden states using hidden variable values obtained through the above-mentioned procedures.

[5] FIG. 1 is a view illustrating a structure of a first-order regular graph to which the conventional iterative operation scheme is applied, and FIGS. 2 and 3 are views illustrating a layer and buffer structure to which the conventional iterative operation scheme is applied based on the first-order regular graph.

[6] In the following, the conventional iterative operation scheme will be described with reference to FIGS. 1 to 3. If all nodes on a graph are connected to the same number of adjacent nodes, the graph can be depicted by a two-dimensional (2D) first-order regular graph as shown in FIG. 1. Here, an iterative operation on a graph of M by N is carried out as follows, wherein the position of a node is represented by a 2D vector p = [p0, p1] with components p0 and p1.

[7] First, suppose that the iteration index on the 2D regular graph of FIG. 1 is l, that N(p) denotes the set of nodes adjacent to a node p, and that the feature cost and the hidden variable of the node p at the l-th iteration are respectively defined as x[p, l] and y[p, l]. Then the update sequences of the feature cost and the hidden variable for every iteration can be expressed as in Formula 1:

[8]

[9] Formula 1

[10] x[p, l] = x[p, l-1]

[11] y[p, l] = f(y[N(p), l-1], x[p, l-1])

[12]

[13] An update sequence on a 2D first-order regular graph based on Formula 1 is schematically illustrated in FIG. 4. The iterative operation scheme applied thereto can be represented by a layer structure, and a third coordinate corresponding thereto may be added to the existing 2D first-order regular graph. In this case, it can be illustrated as in the 3D structure of FIGS. 2 and 3.

[14] In this case, the iterative operation as shown in FIGS. 2 and 3 is of a filter structure for updating the next hidden variable based on the value of the hidden variable and feature cost in the preceding iteration. Hereinafter, such a structure is defined as an iteration filter structure, and a memory structure storing the preceding operation result to be used in an update sequence is denoted as a layer buffer.

[15] After iterating the above operation L times, the final state o(p) can be expressed by Formula 2 using the hidden variable value and the feature cost of each node p and a result calculation function q(·):

[16]

[17] Formula 2

[18] o(p) = q(y[p, L-1], x[p, L-1])

[19]

[20] If the number of iterative operations is fixed, the memory complexity of an existing iterative operation structure has a value of O(C·M·N) for a graph of M by N, wherein C is a constant value determined by the number of iterations and the number of data bits of the hidden variable.

[21] Such an iterative operation structure can be represented in the form of a 3D graph shown in FIG. 2 or 3, wherein hidden variables are stacked in a layer structure at every iteration for each node. In the 3D graph, time-wise iterations are converted into a spatial layer structure, in which the l-th iteration is represented by an l-th layer. In this manner, a layer buffer is shifted in the direction in which l increases as the iterative operation proceeds, updating the hidden variable of the next layer's node.
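
For illustration only, the conventional scheme of Formulas 1 and 2 may be sketched in Python as follows. The update function f and the result calculation function q are placeholders to be supplied by the application (for example, BP-style message updates), scalar costs and a 4-neighbour first-order graph are assumed purely for brevity, and the hidden variable costs of a whole M-by-N layer must be kept in memory, which reflects the O(C·M·N) behaviour discussed above.

    import numpy as np

    def conventional_iteration(x0, f, q, L):
        # Prior-art scheme: Formula 1 update followed by the Formula 2 output.
        #   x0 : (N, M) array of feature costs (one scalar cost per node, for brevity)
        #   f  : placeholder update function f(neighbour_hidden_costs, feature_cost)
        #   q  : placeholder result calculation function q(hidden_cost, feature_cost)
        #   L  : total number of iterations
        N, M = x0.shape
        x = x0.copy()                 # Formula 1: x[p, l] = x[p, l-1] (feature cost is constant)
        y = np.zeros_like(x0)         # hidden variable costs of the whole layer: O(MN) memory
        for l in range(1, L):
            y_next = np.empty_like(y)
            for p0 in range(N):
                for p1 in range(M):
                    nbrs = [y[i, j]
                            for i, j in ((p0 - 1, p1), (p0 + 1, p1), (p0, p1 - 1), (p0, p1 + 1))
                            if 0 <= i < N and 0 <= j < M]
                    # Formula 1: y[p, l] = f(y[N(p), l-1], x[p, l-1])
                    y_next[p0, p1] = f(nbrs, x[p0, p1])
            y = y_next
        # Formula 2: o(p) = q(y[p, L-1], x[p, L-1])
        return np.vectorize(q)(y, x)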

[22] If, however, the conventional iterative operation scheme described above is applied to a typical low-level vision system or digital image processing system, it is necessary to allocate nodes to more than approximately 70,000 pixels per single image. Thus, a large amount of operations is required to estimate states such as motion vector, disparity, and label, and a large memory capacity is required to store hidden variable values for all nodes. Therefore, bandwidth problems may occur in terms of memory access, thereby making it difficult to implement a high-speed system using parallel operations.

[23] In other words, since a plurality of processors must simultaneously access a layer buffer of large size corresponding to the number of nodes on a 2D graph, the bandwidth of the memory bus becomes large, making it difficult to realize a parallel system.

[24]

Disclosure of Invention

Technical Problem

[25] In view of the above, the present invention provides an apparatus and method for iterative operation on a regular graph, capable of performing the following: representing cost for input data on a 3D graph by stacking the cost when performing iterative operation thereon; and carrying out the iterative operation in a parallel manner by sequentially processing the cost for each node in a current group defined along a specific direction.

[26] Further, the present invention also provides an apparatus and method for iterative operation on a regular graph, capable of performing the iterative operation through cost update using a layer buffer and a local buffer based on hidden variable cost and feature cost for each node in a current group defined along a specific axis direction.

[27]

Technical Solution

[28] In accordance with an aspect of the present invention, there is provided an iterative operation apparatus for a regular graph, including:

[29] a feature cost calculation unit configured to calculate a feature cost for input data at each node in the regular graph;

[30] an iterative operation unit configured to carry out an iterative operation based on a hidden variable cost and the feature cost for the input data at each node in the regular graph, thereby producing an iterative operation result; and

[31] a result calculation unit configured to estimate a hidden state for the input data according to the iterative operation result.

[32] Herein, a current group is defined by sectoring nodes in a 3-dimensional graph along a specific direction, the nodes corresponding to the input data, and the 3-dimensional graph being formed by stacking the hidden variable cost and the feature cost for which the iterative operation has been performed. Further, the iterative operation unit sequentially performs the iterative operation with respect to the hidden variable cost and the feature cost for each node in the current group.

[33] In accordance with another aspect of the present invention, there is provided an iterative operation method for a regular graph, including:

[34] defining a current group in a specific direction in a 3-dimensional graph in which hidden variable costs and feature costs are stacked for iterative operations on input data, and performing an operation initialization by setting the number of iterations;

[35] sequentially performing the iterative operations on the hidden variable costs and the feature costs for respective nodes in the current group up to the set number of iterations;

[36] estimating a hidden state of the current group when the iterative operations for the respective nodes in the current group are completed; and

[37] iteratively performing the operation initialization, the iterative operations, and estimation of the hidden state up to a position of a final current group.

Advantageous Effects

[38] In accordance with the aspects of the present invention, if the number of iterations (which is denoted by L) is sufficiently smaller than N in a graph of M by N, it is possible to perform the operation M times quicker than the conventional operation scheme by using M parallel processors, and thereby to realize a high-speed real-time parallel operation structure with a memory whose capacity is L/N times as large as that in the conventional iterative operation scheme.

[39] In case of a typical cut-based segmentation, a sufficiently satisfactory result can be obtained by only 10 iterations with respect to an image whose resolution is 640 by 480. Therefore, in case of not using a parallel structure, the same performance can be achieved using only a memory whose capacity is 11/480 times as large as that in the conventional scheme. Further, in case of using a parallel structure having 640 parallel processors, it is possible to perform the iterative operation 640 times quicker than the conventional one. The scheme of the present invention is better in memory performance and easier to parallelize than the conventional scheme, and enables a real-time system using a VLSI having a distributed memory.

[40] Further, the iterative operation scheme of the present invention may be applied not only to the field of image signal processing but also to various other signal processing fields that can be modeled by a 2D first-order regular graph. Furthermore, the present invention is also well suited to realizing a low-power, low-cost and small-sized system.

[41]

Brief Description of the Drawings

[42] The above and other features of the present invention will become apparent from the following description of embodiments given in conjunction with the accompanying drawings, in which:

[43] FIG. 1 is a view illustrating a structure of a first-order regular graph to which an iterative operation scheme is applied according to the prior art;

[44] FIGS. 2 and 3 are views illustrating a layer and buffer structure to which an iterative operation method is applied through a first-order regular graph according to the prior art;

[45] FIG. 4 is a view illustrating an update sequence on a 2D first-order regular graph according to the prior art;

[46] FIGS. 5 to 7 are views illustrating an iterative operation structure based on an iteration filter according to an embodiment of the present invention;

[47] FIGS. 8 and 9 are views illustrating an iterative operation structure after transformation of node coordinates according to the embodiment of the present invention;

[48] FIGS. 10 and 11 are views illustrating update sequences of hidden variable and feature cost before and after transformation of node coordinates, respectively;

[49] FIG. 12 is a block diagram illustrating an iterative operation apparatus based on an iterative operation structure according to the embodiment of the present invention;

[50] FIG. 13 is a view illustrating a systolic array structure of a parallel processor for processing iterative operation according to the embodiment of the present invention;

[51] FIG. 14 is a flow chart illustrating a process of iterative calculation in a parallel processor structure using an iterative operation apparatus including a layer buffer and a local buffer according to the embodiment of the present invention;

[52] FIG. 15 is a flow chart illustrating a process of updating a layer buffer and a local buffer according to the embodiment of the present invention;

[53] FIGS. 16 and 17 are views schematically illustrating a hierarchical 2D graph structure according to the embodiment of the present invention;

[54] FIGS. 18 and 19 are views illustrating an iterative operation structure on a 3D graph reflecting a hierarchical iterative structure according to the embodiment of the present invention;

[55] FIG. 20 is a view illustrating a depth-first operation order in a system of K = 2 according to the embodiment of the present invention; and

[56] FIGS. 21 to 23 are views for explaining the improvement in an image matching result by a real-time stereo matching system based on the iterative operation structure according to the embodiment of the present invention.

[57]

Best Mode for Carrying Out the Invention

[58] In accordance with an embodiment of the present invention to be described in detail, a current group for which hidden variable cost and feature cost are to be stacked is set in a specific direction for performing an iterative operation on input data, and the iterative operation is initialized by setting the number of iterations. Then, the iterative operation for the hidden variable cost and the feature cost of each node in the current group is sequentially carried out up to the set number of iterations. Further, a hidden state of the current group is estimated when the iterative operation for each node in the current group has been completed, and the initialization, the iterative operation, and the hidden state estimation are repeated until reaching the final current group. In this manner, the drawbacks of the prior art can be overcome.

[59] Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.

[60] FIGS. 5 to 7 are views illustrating an iterative operation structure based on an iteration filter according to the embodiment of the present invention. FIGS. 8 and 9 are views illustrating the iterative operation structure after transformation of node coordinates according to the embodiment of the present invention. FIGS. 10 and 11 are views illustrating update sequences of hidden variable and feature cost before and after transformation of node coordinates, respectively.

[61] Referring to FIGS. 4 to 6, the iterative operation scheme according to the present invention will be described. If the number of iterations (hereinafter denoted by l) and a node coordinate p = [p0, p1] satisfy the conditions 0 ≤ l ≤ L-1, 0 ≤ p0 ≤ N-1 and 0 ≤ p1 ≤ M-1, let us define Q(p0 - l, l) as a linear node set of length M located at the (p0 - l)-th position in the l-th layer of the 3D graph.

[62] Then, a new node set can be defined by collecting Q(p0 - l, l) with respect to the total number of iterations (hereinafter denoted by L). This new node set is an oblique planar set of M by L that starts at Q(p0, 0) and ends at Q(p0 - (L-1), L-1). Let us define this set as Q(p0). Another iterative operation structure having a different layer buffer structure may be created by setting the direction of the iterative operation on this plane. In this case, the current group as in FIGS. 5 to 7 refers to the set on which operations are carried out, and each node refers to a pixel contained in the image data.

[63] In the iterative operation structure based on the iteration filter as in FIGS. 5 to 7, which is to be compared with the conventional iterative operation structure, the previous nodes having the previous hidden variable costs and the previous feature costs that are necessary to update the hidden variable cost of Q(p0 - l, l) pertain to the sets Q(p0 - 2 - (l-1), l-1), Q(p0 - 1 - (l-1), l-1) and Q(p0 - (l-1), l-1), and these linear node sets pertain to Q(p0 - 2), Q(p0 - 1) and Q(p0), respectively. FIGS. 5 to 7 depict an example of carrying out the iterative operation on a set Q(p0) in which p0 = 5.

[64] Memories used in this iterative operation structure may be classified into two types. Specifically, a first memory for storing the values of the two planar node sets Q(p0 - 1) and Q(p0 - 2) whose operations are completed is necessary for carrying out the iterative operation from Q(p0, 0) to Q(p0 - (L-1), L-1) along the set Q(p0), and a second memory for storing the result of the immediately preceding iterative operation (i.e., Q(p0 - (l-1), l-1)) is necessary for updating a node Q(p0 - l, l) in the set Q(p0). Herein, let us define the first memory as a layer buffer and the second memory as a local buffer.

[65] The layer buffer (i.e., the first memory) and the local buffer (i.e., the second memory) carry out respective updating and buffering processes as the iterative operation proceeds. When carrying out the operation for the set Q(p0 - l, l) on Q(p0), the previous hidden variable costs and the previous feature cost that are necessary for calculating y_Q(p0-l,l) are fetched from the buffers. Herein, the value of y_Q(p0-(l-1),l-1) is in Q(p0) and thus is stored in the local buffer, and the values of y_Q(p0-1-(l-1),l-1) and y_Q(p0-2-(l-1),l-1) are values in Q(p0 - 1) and Q(p0 - 2), respectively, and thus are stored in the layer buffer.

[66] Further, the feature cost of Q(p0 - l, l) can be obtained from Q(p0 - l, l-1), which is the node set located at the same position in the previous iteration step. In other words, the nodes that have the value of x_Q(p0-l,l) = x_Q(p0-l,l-1) pertain to Q(p0 - 1), so that the feature cost can be obtained from the layer buffer.

[67] Here, if the set of adjacent nodes is defined as N(Q(p0 - l, l-1)), a new hidden variable cost of Q(p0 - l, l) can be calculated as in Formula 3 by applying a function f(·) to the previous hidden variable costs and the previous feature cost that are acquired from the two buffers.

[68]

[69] Formula 3

[70] y_Q(p0-l,l) = f(y_N(Q(p0-l,l-1)), x_Q(p0-l,l-1))

[71]

[72] In particular, after the hidden variable costs are updated with respect to the entire nodes in Q(p0 - l, l), the hidden variable cost y_Q(p0-(l-1),l-1) and the feature cost x_Q(p0-(l-1),l-1) that were previously stored in the local buffer are stored in the layer buffer at a portion corresponding to Q(p0 - 1 - (l-1), l-1) and, at the same time, the newly calculated hidden variable cost is stored in the local buffer. Further, the data y_Q(p0-2-(l-1),l-1), y_Q(p0-1-(l-1),l-1) and x_Q(p0-1-(l-1),l-1) in the (l-1)-th layer in the layer buffer are updated as in Formula 4:

[73]

[74] Formula 4

[75] y_Q(p0-2-(l-1),l-1) = y_Q(p0-1-(l-1),l-1)

[76] y_Q(p0-1-(l-1),l-1) = y_Q(p0-(l-1),l-1)

[77] x_Q(p0-1-(l-1),l-1) = x_Q(p0-(l-1),l-1)

[78]

[79] Herein, Formula 4 represents a shift of data in the layer buffer during the iterative operation, and it can be seen in FIG. 6 that the layer buffer is shifted by one in the direction of p0 when a hidden variable cost is updated through one iterative operation.

[80] Accordingly, when the operation is completed for the entire iterative layers in Q(p0), the layer buffer that was located in the area including the nodes of Q(p0 - 2) and Q(p0 - 1) is moved to Q(p0 - 1) and Q(p0). Then, an operation for a new current group Q(p0 + 1) can be carried out using the moved layer buffer and, likewise, the iterative operation can be carried out for the entire nodes in the graph.

[81] Meanwhile, a local buffer initializes data whenever the current group is changed to a new one. Further, whenever a hidden variable cost of a new node is updated through iterative operation, existing buffer data is sent to and stored in the layer buffer, and the newly calculated hidden variable cost is stored. The position of the local buffer is moved one by one along the direction of the current group whenever iterative operation moves on.
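
For illustration only, the buffer behaviour of Formulas 3 and 4 for a single linear node set may be sketched in Python as follows. The arrays stand in for the layer buffer (hidden variable costs of Q(p0 - 1) and Q(p0 - 2), plus the feature costs of Q(p0 - 1)) and the local buffer, f is a placeholder update function, and the p1-direction neighbours are assumed to be taken from the Q(p0 - 1) plane of the layer buffer, since they lie at the same p0 coordinate as the node set being updated.

    import numpy as np

    def update_node_set(y_layer, x_layer, y_local, x_local, f):
        # One update of the linear node set Q(p0 - l, l), as in Formulas 3 and 4.
        #   y_layer : (2, M) hidden variable costs of Q(p0 - 1) and Q(p0 - 2)  (layer buffer)
        #   x_layer : (M,)   feature costs of Q(p0 - 1)                        (layer buffer)
        #   y_local : (M,)   hidden variable costs of Q(p0 - (l-1), l-1)       (local buffer)
        #   x_local : (M,)   feature costs of Q(p0 - (l-1), l-1)               (local buffer)
        #   f       : placeholder update function (e.g. a BP-style message update)
        M = y_local.shape[0]
        y_new = np.empty(M)
        for p1 in range(M):
            # Formula 3: previous costs are fetched from the local buffer and the layer buffer
            prev_y = [y_local[p1], y_layer[1, p1]]
            if p1 > 0:
                prev_y.append(y_layer[0, p1 - 1])
            if p1 < M - 1:
                prev_y.append(y_layer[0, p1 + 1])
            y_new[p1] = f(prev_y, x_layer[p1])
        x_prev = x_layer.copy()            # feature costs of Q(p0 - 1) before the shift
        # Formula 4: shift the layer buffer by one position in the p0 direction
        y_layer[1, :] = y_layer[0, :]
        y_layer[0, :] = y_local
        x_layer[:] = x_local
        # the local buffer now holds the newly calculated costs of Q(p0 - l, l);
        # its feature cost is fetched from the layer buffer as described in paragraph [66]
        y_local[:] = y_new
        x_local[:] = x_prev
        return y_layer, x_layer, y_local, x_local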

[82] In the present embodiment, an access address of a buffer is defined in the form of a vector to simplify the representation of the above-mentioned iterative operation. Specifically, a buffer vector is defined as [a, p1, l], where a refers to the position of a buffer and is an integer satisfying the condition -2 ≤ a ≤ 0. Further, p1 and l are addresses for accessing buffer data and are identical to those used in the 3D graph.

[83] The buffer vector means an address of the layer buffer if a = -1 or a = -2, and an address of the local buffer if a = 0. If the third index of a buffer vector is l for all buffers, the data at that address is used for the (l+1)-th operation, and the coordinate position represented by [a, p1, l] is [p0 + a - l, p1, l] on the actual 3D graph.

[84] During use of such a buffer vector, a coordinate system is used in which the 2D graph of each layer is moved by l, the number of iterations, in the direction of -p0, as shown in FIGS. 8 and 9, which depict the operation flow and the movement of each buffer. It can be seen from this coordinate system that the local buffer is vertically moved whenever the iterative operation proceeds, and the layer buffer is horizontally moved whenever an operation for a single operation group is completed. Considering the above transformed coordinate system and buffer vectors, the sequence of the iterative operation can be expressed, using pseudo-code, in the form of the algorithm shown below:

[85]

<New iteration sequence>
for node index p0 = 0 to N-1 do
    Initialization:
    for each parallel processor p1 = 0 to M-1 do
        (initialize the hidden variable cost and the feature cost; see paragraph [89])
    end for
    Iteration:
    for each layer l = 1 to L do
        for each parallel processor p1 = 0 to M-1 do
            Update local buffer
            Update layer buffer
        end for
    end for
    Calculate output
end for

Subroutine 1: Update local buffer
    y[0, p1, l] = f(y_N[-1, p1, l-1], x[-1, p1, l-1])
    x[0, p1, l] = x[-1, p1, l-1]

Subroutine 2: Update layer buffer
    y[-2, p1, l-1] = y[-1, p1, l-1]
    y[-1, p1, l-1] = y[0, p1, l-1]
    x[-1, p1, l-1] = x[0, p1, l-1]

Subroutine 3: Calculate output
    for each parallel processor p1 = 0 to M-1 do
        o[0, p1, L+1] = q(y[-1, p1, L], x[-1, p1, L])
    end for

[86]
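
A self-contained Python sketch of the above iteration sequence is given below, for illustration only. The buffer vector [a, p1, l] is realized as plain arrays indexed by the buffer position a (array row 0 for a = -1, row 1 for a = -2) and the processor index p1, the per-processor loop emulates the M parallel processors of FIG. 13 sequentially, and f, q and feature_cost are placeholders to be supplied by the application. Boundary handling at the edges of the graph and the exact initialization of the bottom-level nodes are simplified, and the output of each current group is read from the local buffer rather than the layer buffer, which is equivalent for this sketch.

    import numpy as np

    def iterate_regular_graph(data, feature_cost, f, q, L):
        # Sketch of the new iteration sequence over the current groups Q(p0), p0 = 0..N-1.
        #   data         : (N, M) input data (e.g. one value per pixel)
        #   feature_cost : placeholder function mapping an input value to a feature cost
        #   f            : placeholder update function (Formula 3 / Subroutine 1)
        #   q            : placeholder result calculation function (Subroutine 3)
        #   L            : number of iterations per current group
        N, M = data.shape
        y_layer = np.zeros((2, M, L))     # layer buffer: y at a = -1 (row 0) and a = -2 (row 1)
        x_layer = np.zeros((M, L))        # layer buffer: x at a = -1
        out = np.zeros((N, M))

        for p0 in range(N):                               # loop over the current groups
            # operation initialization: the local buffer holds layer l-1 of the current group
            y_prev = np.zeros(M)
            x_prev = np.array([feature_cost(data[p0, p1]) for p1 in range(M)])

            for l in range(1, L + 1):
                new_y = np.empty(M)
                new_x = np.empty(M)
                for p1 in range(M):                       # M parallel processors (emulated)
                    # Subroutine 1: read the previous costs from the local and layer buffers
                    prev_y = [y_prev[p1], y_layer[1, p1, l - 1]]
                    if p1 > 0:
                        prev_y.append(y_layer[0, p1 - 1, l - 1])
                    if p1 < M - 1:
                        prev_y.append(y_layer[0, p1 + 1, l - 1])
                    new_y[p1] = f(prev_y, x_layer[p1, l - 1])
                    new_x[p1] = x_layer[p1, l - 1]
                # Subroutine 2: shift the layer buffer of layer l-1 by one group position
                y_layer[1, :, l - 1] = y_layer[0, :, l - 1]
                y_layer[0, :, l - 1] = y_prev
                x_layer[:, l - 1] = x_prev
                y_prev, x_prev = new_y, new_x             # local buffer now holds layer l

            # Subroutine 3: estimate the hidden state of the current group
            out[p0, :] = [q(y_prev[p1], x_prev[p1]) for p1 in range(M)]
        return out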

[87] In case of image processing based on the iterative operation structure described above, the cost (such as the hidden variable cost and the feature cost) for each pixel according to the stepwise operation result is sequentially stacked to represent a 3D iterative operation graph while the iterative operation is being performed for outputting the cost for each pixel. Further, the iterative operation is performed on the hidden variable cost and the feature cost for each node along a specific direction (e.g., a diagonal direction) with reference to a specific node in the 3D iterative operation graph, thereby producing an iterative operation result to enable the image processing.

[88] Herein, a planar node set (current group) inclined in a specific direction may be defined on the 3D iterative operation graph sectored into two or more sets, and the iterative operation may be performed thereon sequentially along the specific direction. Then, the final result is calculated through procedures including initialization and operation for each node set (current group) defined in each operation step on the hierarchical 3D iterative operation graph including the sectored node sets.

[89] The initialization of the above may be carried out by calculating the hidden variable cost and the feature cost based on the input image data in the lowest-level node, storing them in the buffers including the local buffer and the layer buffer, and, from the next-level node, using the previous hidden variable cost and feature cost stored in the buffers in each step. Further, the iterative operation of the above may be performed by extracting the previous feature cost and hidden variable cost from the local buffer and the layer buffer by following the sequence of the depth-first operation; calculating a new hidden variable cost therefrom; and storing the new hidden variable cost in the local buffer and the layer buffer to update the hidden variable cost.

[90] FIG. 12 is a block diagram illustrating an iterative operation apparatus based on the iterative operation structure according to the present embodiment. As shown therein, the iterative operation apparatus includes a feature cost calculation unit 802, an iterative operation unit 804 and a result calculation unit 806.

[91] The feature cost calculation unit 802 receives data such as an image to calculate the feature cost for each node in the operation groups, and initializes the iterative operation structure based on the calculated feature cost. Further, the iterative operation unit 804 performs the iterative operation using an iterative operation module to thereby calculate hidden variable costs for M nodes, and updates the buffers for the entire number of iterations (L times) in a parallel manner through the parallel processor and distributed memory structure as illustrated in FIG. 13.

[92] After the iterative operation is completed, the result calculation unit 806 calculates and outputs a final result for each node using a result calculation module. More specifically, after the iterative operation is repeated for a total of L times, a final state o(p) can be calculated by Formula 2 above using the hidden variable cost and the feature cost for each node p and the result calculation function q(·).

[93] When this image processing system processes an image of M by N, the iterative operation is carried out for a total of L times. If the data sizes of the hidden variable cost and the feature cost are B bits, the overall size of the local buffer is equal to B·M, and the overall size of the layer buffer is equal to 2·B·L·M.

[94] Further, if the complexity of a memory is expressed using the 'Big Oh' notation used for the time complexity of an algorithm, the local buffer has a complexity of O(M) and the layer buffer has a complexity of O(2LM). Accordingly, the memory complexity of the entire system amounts to O((2L+1)M).
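
As a worked example of these sizes (B = 16 bits is chosen here purely for illustration; only the image size and the number of iterations below are taken from figures mentioned elsewhere in this description), the buffers for a 640-by-480 image processed with L = 10 iterations would occupy roughly the following amounts:

    # Illustrative buffer sizes only; B is an assumed example value.
    M, N, L, B = 640, 480, 10, 16                        # 640-by-480 image, 10 iterations, 16-bit costs
    local_buffer_bits = B * M                             # B*M
    layer_buffer_bits = 2 * B * L * M                     # 2*B*L*M
    total_bits = local_buffer_bits + layer_buffer_bits    # (2L+1)*B*M
    print(local_buffer_bits / 8 / 1024, "KB local buffer")    # 1.25 KB
    print(layer_buffer_bits / 8 / 1024, "KB layer buffer")    # 25.0 KB
    print(total_bits / 8 / 1024, "KB in total")               # 26.25 KB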

[95] The iterative operation structure described above is a scheme for obtaining the hidden variable cost for each node by reflecting information of the entire nodes of a 2D graph through iterative operation between adjacent nodes. Therefore, more iterations are needed to collect information from a remote location as the size of the graph becomes larger. Further, the number of iterations for a stable result is not a constant. Considering this, a hierarchical iteration structure may be used to implement a system capable of obtaining consistent results within a fixed number of iterations smaller than that in the conventional case.

[96] FIG. 14 is a flow chart illustrating a process of iterative calculation in the parallel processor structure using the iterative operation apparatus including the layer buffer and the local buffer according to the embodiment of the present invention.

[97] As shown therein, the iterative operation unit 804 is initialized in step 1002 for processing an image by the image processing system. Then, in step 1004, data such as an image is input to the feature cost calculation unit 802. Thereafter, in step 1006, the feature cost calculation unit 802 sets the position of a current group (operation group), i.e., a node set Q(p0), for the input data to be 0.

[98] The feature cost calculation unit 802 sets the number of iterations (which is denoted by l) to be 1 in step 1008, and calculates the feature cost for each node in the current group in step 1010. Then, in step 1012, the feature cost calculation unit 802 initializes the iterative operation structure using the feature cost calculated for each node.

[99] Then, the iterative operation unit 804 carries out, in step 1014, parallel operation on the feature cost and the hidden variable cost for each node in the current group, and updates the layer buffer and the local buffer in step 1016.

[100] Further, in step 1018, the iterative operation unit 804 checks whether or not the number of iterations (which is denoted by l) has reached the final number of iterations (which is denoted by L) while the layer buffer and the local buffer are being updated. Hereinafter, an update sequence of the layer buffer and the local buffer by parallel operation for the feature cost and the hidden variable cost will be described in detail with reference to FIG. 15.

[101] If the check result in step 1018 indicates that the number of iterations has not reached the final number of iterations, the iterative operation unit 804 continues to carry out steps 1014 and 1016 for updating the layer buffer and the local buffer by carrying out parallel operation for the feature cost and the hidden variable cost in the current group up to the final number of iterations. Then, if the number of iterations has reached the final number of iterations (that is, all iterative operations on the current group p0 = k have been completed), the result calculation unit 806 estimates in step 1020 a hidden state for the current group using the hidden variable cost and the feature cost.

[102] Then, in step 1022, the result calculation unit 806 checks whether or not the position (p0) of the current group is identical to the position (N) of the final current group. In other words, the result calculation unit 806 checks whether or not the current group whose hidden state is estimated after carrying out the current iterative operation is the final current group.

[103] If the check result in step 1022 indicates that the position of the current group is not equal to that of the final current group, the position of the current group is increased by one in step 1006 by the feature cost calculation unit 802, and steps 1008 to 1020 are repeated for the current group up to the position of the final current group. Then, if the position of the current group becomes equal to that of the final current group, the result calculation unit 806 finishes the estimation of the hidden states of the image scan lines (node lines), and outputs the data processing result in step 1024.

[104] FIG. 15 is a flow chart illustrating the process of updating the layer buffer and the local buffer performed in step 1016 of FIG. 14.

[105] As shown therein, after the iterative operation apparatus enters a buffer update mode in step 1102, the update sequences are carried out separately (i.e., in a parallel manner) depending on whether the iterative operation unit 804 performs the iterative operation using the layer buffer or the local buffer. Specifically, it is checked in step 1104 whether the update involves the layer buffer or the local buffer and, if the check result indicates the update sequence using the layer buffer, the process proceeds to step 1106. In step 1106, the hidden variable costs corresponding to the positions [-2, p, l-1] and [-1, p, l-1] of the layer buffer are updated to those corresponding to the positions [-1, p, l-1] of the layer buffer and [0, p, l-1] of the local buffer, respectively. Herein, step 1106 may be described by Formula 5:

[106]

[107] Formula 5

[108] y[-2, p, l-1] = y[-1, p, l-1]
      y[-1, p, l-1] = y[0, p, l-1]

[109]

[110] Then, in step 1108, the feature cost corresponding to the position [-1, p, l-1] in the layer buffer is updated to that corresponding to the position [0, p, l-1] in the local buffer. Here, step 1108 may be described by Formula 6:

[111]

[112] Formula 6

[113] x[-1, p, l-1] = x[0, p, l-1]

[114]

[115] Meanwhile, if the check result indicates the update sequence using the local buffer, the process proceeds to step 1110, so that the hidden variable cost corresponding to the position [0, p, l] is updated to a new one calculated from the hidden variable costs and the feature cost in the previous step in the parallel processor. Here, step 1110 may be described by Formula 7:

[116]

[117] Formula 7

[118] y[0, p, l] = f(y_N[-1, p, l-1], x[-1, p, l-1])

[119]

[120]

[121] Then, in step 1112, the feature cost corresponding to the position [0, p, l] of the local buffer is updated to that corresponding to the position [-1, p, l-1] in the layer buffer. Here, step 1112 may be expressed by Formula 8:

[122]

[123] Formula 8

[124] x[0, p, l] = x[-1, p, l-1]

[125]

[126] FIGS. 16 and 17 are conceptual views illustrating the hierarchical 2D graph structure according to the embodiment of the present invention. As a hierarchical level increases, the number of nodes in the graph gradually decreases, and the nodes contain more collective information. The nodes are under-sampled at the ratio of a half per each level-up, and the overall size of the graph is a quarter of that in the previous level. That is, when there exist a total of K hierarchical levels, the number of nodes in the top level K-1 becomes (M/2^(K-1)) × (N/2^(K-1)) in case of the graph of M by N.

[127] Following a coarse-to-fine operation sequence, the hierarchical iterative operation starts at the smallest graph of the top level and ends at the densest graph of the bottom level, which is illustrated in FIG. 18 as a 3D graph.

[128] Referring to FIGS. 18 and 19, the iterative operation structure will be described.

When a single iterative operation is carried out between nodes in step k+1, a shift effect equivalent to that of two operations carried out in step k is observed, and it can be seen that the amount of the layer shift per iteration varies with the hierarchical level.

[129] Therefore, it is no longer proper for the current group to be defined by the above-mentioned set Q(p0). Accordingly, it is necessary to newly define the node coordinate and the node set according to the hierarchical level. Specifically, if p^k = [p0^k, p1^k], l^k and ξ^k are respectively defined as the node coordinate, the number of iterations and the total amount of layer shifts at level k, the node sets can be redefined based thereon as in Formula 9:

[130]

[131] Formula 9

[132] Q(p0^k - l^k, l^k) = { [p0^k - l^k, p1^k, l^k] | 0 ≤ p1^k ≤ M/2^k - 1 }
      Q(p0^k - 2ξ^(k+1)) = { Q(p0^k - l^k, l^k) | 1 ≤ l^k ≤ L^k }
      ξ^k = L^k + 2ξ^(k+1),  ξ^(K-1) = L^(K-1)

[133]

[134] where ξ^(k+1) is the total amount of layer shifts generated by the iterative operations in the steps that come before the step k and, by multiplying ξ^(k+1) by 2, an exact total sum of layer shifts in view of the step k can be calculated.

[135] During the iterative operation, since the number of nodes is increased twice between adjacent steps, the current group expressed as a single planar node set at the top level contains information about a total of 2^(K-1) planes at the bottom level. FIGS. 18 and 19 represent operations in a hierarchical graph structure of K = 2. Specifically, after the operation group of the level k+1 undergoes L^(k+1) operations, the operation of the next level k is started. Since one of the level k+1 groups contains information corresponding to two of the level k groups, L^k operations need to be carried out twice sequentially at the level k.

[136] These sequential operations do not change the operation structure using the local buffer and the layer buffer as in FIGS. 18 and 19. In other words, in the same level, the iterative operation can be carried out without changing the sizes and structures of the local buffer and the layer buffer.

[137] However, since data of the previous-level nodes are needed before the iteration is started at each step, the buffer update sequence may become complicated. To avoid this, the operation is carried out in a tree-structured graph in order to enhance operation efficiency while using a local buffer of a small capacity, and the iterative operation is performed following the sequence of the depth-first operation. Herein, the depth-first operation proceeds such that, at a branching point, the finer level (i.e., the one that requires the smaller memory) takes priority.

[138] FIG. 20 is a view illustrating the sequence of the depth-first operation in a system of K = 2 according to the embodiment of the present invention. The memory complexity of the hierarchical operation structure is equal to the sum of O((2L^k + 1)·M/2^k) over the levels k = 0 to K-1.
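
For illustration only, the sketch below enumerates the per-level memory terms of the sum above and a depth-first visiting order for the operation groups, in which each coarser-level group is followed by its two finer-level groups (as described for K = 2 above); the per-level iteration counts in iters are example values, not values prescribed by this description.

    def depth_first_schedule(K, iters):
        # Yield (level, group_index, n_iterations) in coarse-to-fine, depth-first order.
        # iters[k] is the number of iterations L^k used at level k (example values).
        def visit(k, idx):
            yield (k, idx, iters[k])
            if k > 0:                       # one level-k group covers two level-(k-1) groups
                yield from visit(k - 1, 2 * idx)
                yield from visit(k - 1, 2 * idx + 1)
        yield from visit(K - 1, 0)          # start at the top (coarsest) level K-1

    def hierarchical_memory_terms(M, iters):
        # per-level terms (2*L^k + 1) * M / 2^k of the memory complexity sum
        return [(2 * iters[k] + 1) * M // 2**k for k in range(len(iters))]

    K, M = 2, 640                           # two hierarchical levels, 640 processors (example)
    iters = [4, 4]                          # assumed per-level iteration counts
    for level, idx, n in depth_first_schedule(K, iters):
        print("level", level, "group", idx, ":", n, "iterations")
    print("memory terms per level:", hierarchical_memory_terms(M, iters))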

[139] As discussed above, the present embodiment relates to a parallel iterative operation scheme having an efficient memory structure for processing a digital image. Although existing iterative operation technologies have low error rates, they need a large memory and a long calculation time to estimate a result based on hidden variable costs for a large number of nodes in a 2D graph. In addition, if a system in which a number of parallel processors simultaneously access a memory of a large capacity is used to improve the operation speed, the bandwidth of the data bus increases significantly.

[140] Regarding this, it is possible to remarkably reduce the memory capacity required for the system by adopting a scheme in which the iterative operation on the 2D first-order graph is represented using a 3D graph formed by adding thereto one or more iterative levels, and the 3D graph is sequentially processed in a parallel manner along a specified direction. Thus, a high-speed parallel image processing system can be realized in such a manner that distributed memories of small capacities are used in a very large scale integrated circuit (VLSI).

[141] As described above, the algorithm in accordance with the present embodiment is suitable for high-speed parallel image processing based on iterative operation, and shows excellent performance in implementing a real-time image processing system using a small-sized VLSI chip. This algorithm is especially effective when the total number of iterations satisfies the condition 2L + 1 ≪ N.

[142] When L is close to N, this algorithm does not substantially differ from conventional ones in terms of memory resources, but the efficiency of this algorithm may be remarkably improved when the number of iterations is small. In image processing fields that adopt an iteration scheme whose number of iterations is relatively small, this algorithm may enhance the efficiency of, e.g., a segmentation scheme that has a rapid convergence speed. In such a case, a sufficiently stable result can be obtained for the entire image by about 10 iterations.

[143] Compared to the typical cut-based BP scheme, the present embodiment can reduce the memory capacity by a factor of 480/11, i.e., about 44 times, in case of a 640-by-480 image, and can solve the problem of image processing schemes using iterative operation on a conventional 2D graph, which have difficulty realizing parallel operations due to their memory bandwidth.

[144] That is, the present embodiment enables high-speed real-time processing by parallel processors using distributed memories in a VLSI. Further, since the system of the present embodiment is configured with a combination of small operation units, it can easily be expanded so that the system design can be conducted with flexibility for coping with image inputs of various sizes required by the image processing system.

[145] FIGS. 21 to 23 are views for comparing image matching results between a conventional image matching system and a real-time stereo matching system based on the iterative operation structure according to the embodiment. Among them, FIG. 21 shows an original picture, FIG. 22 shows a matching result of the conventional system, and FIG. 23 shows a matching result of the system based on the method of the embodiment. It can be seen therefrom that the error performances of the two systems are substantially the same.

[146] Table 1 describes ratios between the memory complexities of the iterative operation structures according to the prior art and the present embodiment in a 2D graph structure having a hierarchical structure.

[147]

[148] Table 1

[Table 1]

[149]

[150] In the above table, the image sizes used in the respective systems are shown in the column of 'Image size'; the numbers (scale) of hierarchical levels used and the total numbers (iter.) of iterations for each level are shown in the column of 'Iteration time'; and the ratios of the memory complexity of the hierarchical structure, which is equal to the sum of O((2L^k + 1)·M/2^k) over the levels k = 0 to K-1, to the memory complexity of the other systems are shown in the column of 'Ratio.' It can be seen from the table that the memory complexity of the system according to the embodiment is 4 to 27 times lower than those of the conventional systems.

[151] While the invention has been shown and described with respect to the embodiments, it will be understood by those skilled in the art that the system and the method are only examples of the present invention and that various changes and modifications may be made without departing from the scope of the invention as defined in the following claims.