


Title:
DYNAMICS AND CONTROL OF STATE-DEPENDENT NETWORKS FOR PROBING GENOMIC ORGANIZATION
Document Type and Number:
WIPO Patent Application WO/2013/040076
Kind Code:
A1
Abstract:
The present subject matter is directed to a computer-executable method for characterizing network controllability and further for characterizing influential nodes in a network via network entropy using one or more computing devices. The method is particularly useful for state-dependent networks, such as the genome of a cell, where chromosomal geometry directly relates to genomic activity, which in turn strongly correlates with geometry. Currently, there is no systematic methodology for assessing the controllability of the network from experimental measurements. This subject matter provides such an assessment and computer-executable analysis technique to examine network controllability and classify critical network configurations that are susceptible to external influence. The method provides a computer-executable means to not only identify a receptive state in the cell, but also multiple novel targets in a pathological system simultaneously. The method is also useful and has applications for designing influence mechanisms, e.g., marketing strategies on social networks, for designing vaccination strategies to influence the progression of an infectious disease, and for designing robust robotic networks, e.g., those that are less controllable, i.e., with respect to external perturbations.

Inventors:
RAJAPAKSE INDIKA (US)
GROUDINE MARK (US)
MESBAHI MEHRAN (US)
Application Number:
PCT/US2012/054921
Publication Date:
March 21, 2013
Filing Date:
September 12, 2012
Assignee:
HUTCHINSON FRED CANCER RES (US)
UNIV WASHINGTON CT COMMERCIALI (US)
RAJAPAKSE INDIKA (US)
GROUDINE MARK (US)
MESBAHI MEHRAN (US)
International Classes:
G06F19/00; G16B5/00
Domestic Patent References:
WO2009038248A1 2009-03-26
Foreign References:
US5615319A 1997-03-25
EP1540505A2 2005-06-15
EP1764717A1 2007-03-21
Attorney, Agent or Firm:
POOR, Brian, W. (1420 Fifth Avenue #280, Seattle WA, US)
Claims:
CLAIMS

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:

1. A computer-executable method, comprising:

representing a dynamic system as nodes being networked to zero or more nodes which together represent a network on a computing device;

analyzing by the computing device a diffusion dynamics over the network as a derivative of geometric state of the system over time, which is a first sum of two summands, the first summand being a product of two multiplicands and the second summand being a first product of an indication that an input matrix that might be state dependent and an input to the network, the second summand being a negative second product of two multiplicands, the first multiplicand being the geometric state of the system as a function of time, the second multiplicand being a second sum of two summands, the third summand of the second sum being a matrix Kronecker product of a weighted Laplacian of the network and an identity matrix, the fourth summand of the second sum being a diagonalizable matrix with a vector of the indication that the input matrix might be state dependent on its diagonal; and

determining a desirable behavior of the network by identifying entropic network configurations in accordance with the input.

2. The method of Claim 1, wherein the dynamic system is selected from a group consisting essentially of a cell, a disease spread within a network of interacting individuals, a social network, and a robotic network.

3. The method of Claim 1, wherein the desirable behavior is a therapeutic solution.

4. The method of Claim 1, wherein the entropic network configurations comprise controllable network configurations.

5. The method of Claim 1, wherein a weight on an interaction link between two nodes is inversely proportional to their distance indicating a stronger interaction between the nodes that are geometrically closer.

6. The method of Claim 1, wherein the entropic network configurations are disordered.

7. The method of Claim 1, wherein a weight on an interaction link between a chromosome at one node and another chromosome at another node is inversely proportional to their distance indicating a stronger interaction between the chromosomes within a cell that are geometrically closer.

8. The method of Claim 1, further identifying master regulators that globally influence the fate of a cell.

9. The method of Claim 1, further selecting a location for injecting steering signals into the network so as to determine the controllability of the network.

10. The method of Claim 1, further selecting a location for injecting steering signals into the network to steer cell organization to a desired network equilibria.

11. The method of Claim 1, wherein the desirable behavior is a therapeutic intervention for cell processes that are cancerous as represented by the network.

12. A computer-executable method, comprising:

representing cell specialization over time using a spatial network and a transcriptional network on a computing device;

modeling a relationship between the spatial network and the transcriptional network so that coregulated gene content emerges according to chromosome associations; aligning the spatial network and the transcriptional network so that they become mutually related; and

determining a desirable behavior of the mutually related networks by identifying entropic network configurations in accordance with the input.

13. The method of Claim 12, wherein modeling further models the transcriptional network so that the coregulated genes precede and shape the spatial network.

14. The method of Claim 13, further comprising feeding back between the spatial network and the transcriptional network to fine tune cell specific network configurations.

15. A computer-readable medium having computer-executable instructions stored thereon to implement a computer-implementable method, comprising:

representing cell specialization over time using a spatial network and a transcriptional network on a computing device;

modeling a relationship between the spatial network and the transcriptional network so that coregulated gene content emerges according to chromosome associations; aligning the spatial network and the transcriptional network so that they become mutually related; and

determining a desirable behavior of the mutually related networks by identifying entropic network configurations in accordance with the input.

16. The computer-readable medium of Claim 15, wherein modeling further models the transcriptional network so that the coregulated genes precede and shape the spatial network.

17. The computer-readable medium of Claim 16, further comprising feeding back between the spatial network and the transcriptional network to fine tune cell specific network configurations.

Description:
DYNAMICS AND CONTROL OF STATE-DEPENDENT NETWORKS FOR PROBING GENOMIC ORGANIZATION

CROSS-REFERENCE TO A RELATED APPLICATION

This application claims the benefit of Provisional Application No. 61/533776, filed September 12, 2011, and Provisional Application No. 61/700225, filed September 12, 2012, both of which are incorporated herein by reference.

STATEMENT OF GOVERNMENT LICENSE RIGHTS

This invention was made with Government support under Grant #1K25DK08791-01A109, Grant #R37DK44746, and Grant #R01 HL65440 awarded by the National Institutes of Health; Grant #CMMI-0856737 awarded by the National Science Foundation; and Grant #FA9550-09-1-0091 awarded by the Air Force Office of Scientific Research. The Government has certain rights in the invention.

BACKGROUND

In recent years, network science has emerged as a powerful conceptual paradigm in the biological sciences. The reason for this is twofold. First, it has become desirable to gain a deeper understanding of the role of interelemental interactions in the collective functionality of cellular organisms. It has also become increasingly clear that networked systems in biology often intricately evolve with multiple time scales, a property dictated by how the element or node dynamics intertwine with global network dynamics and functionality. A basic premise in network science is that the structure of the network influences the dynamical and functional properties exhibited at the system level. In this context, the relationship between higher levels of connectivity in the network and the convergence rate to certain biological equilibria or limit cycle can be examined.

SUMMARY

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. One aspect of the subject matter includes a method form which recites a computer executable method. The method comprises representing a dynamic system as nodes being networked to zero or more nodes which together represent a network on a computing device. The method further comprises analyzing by the computing device a diffusion dynamics over the network as a derivative of geometric state of the system over time, which is a first sum of two summands. The first summand is a product of two multiplicands and the second summand is a first product of an indication that an input matrix that might be state dependent and an input to the network. The second summand is a negative second product of two multiplicands. The first multiplicand is the geometric state of the system as a function of time. The second multiplicand is a second sum of two summands. The third summand of the second sum is a matrix Kronecker product of a weighted Laplacian of the network and an identity matrix. The fourth summand of the second sum is a diagonalizable matrix with a vector of the indication that the input matrix might be state dependent on its diagonal. The method also comprises determining a desirable behavior of the network by identifying entropic network configurations in accordance with the input.

Another aspect of the present subject matter includes another method form which recites another computer-executable method. The method comprises representing cell specialization over time using a spatial network and a transcriptional network on a computing device. The method also comprises modeling a relationship between the spatial network and the transcriptional network so that coregulated gene content emerges according to chromosome associations. The method additionally comprises aligning the spatial network and the transcriptional network so that they become mutually related. The method further comprises determining a desirable behavior of the mutually related networks by identifying entropic network configurations in accordance with the input.

A further aspect of the present subject matter includes a computer-readable form which recites a computer-readable medium having computer-executable instructions stored thereon to implement a computer implementable method. The method comprises representing cell specialization over time using a spatial network and a transcriptional network on a computing device. The method also comprises modeling a relationship between the spatial network and the transcriptional network so that coregulated gene content emerges according to chromosome associations. The method additionally comprises aligning the spatial network and the transcriptional network so that they become mutually related. The method further comprises determining a desirable behavior of the mutually related networks by identifying entropic network configurations in accordance with the input.

DESCRIPTION OF THE DRAWINGS

The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:

FIGURE 1 illustrates block diagrams in accordance with various embodiments of the present subject matter;

FIGURE 2 illustrates pictorial diagrams in accordance with various embodiments of the present subject matter;

FIGURE 3 illustrates a pictorial diagram in accordance with various embodiments of the present subject matter;

FIGURE 4 illustrates a block diagram in accordance with various embodiments of the present subject matter;

FIGURE 5 illustrates a pictorial diagram in accordance with various embodiments of the present subject matter;

FIGURE 6 illustrates various pictorial diagrams in accordance with various embodiments of the present subject matter;

FIGURE 7 illustrates various pictorial diagrams in accordance with various embodiments of the present subject matter;

FIGURE 8 illustrates various pictorial diagrams in accordance with various embodiments of the present subject matter; and

FIGURE 9 illustrates various pictorial diagrams, including pictorial diagram A, which illustrates a correlation matrix for chromosome 1 for fibroblasts + MyoD, based on Hi-C data; pictorial diagram B, which illustrates a predicted correlation matrix for chromosome 1 for fibroblasts + MyoD, based on the method disclosed herein; pictorial diagram C, which illustrates the error (predicted - actual matrix); and pictorial diagram D, in which the circles represent MyoD binding sites from ChIP-seq data and the dotted lines are the method-predicted dominant binding sites that are hypothesized to be sufficient to induce conversion from fibroblasts to myotubes.

DETAILED DESCRIPTION

Various embodiments of the present subject matter highlight the elaborate connections between network structure and network function in the context of the genome using computing hardware, software, or their combinations. In particular, the dynamics and control properties of chromosomal networks during cell differentiation are examined via empirical findings and abstraction-based models. Structural reorganization in the nucleus during differentiation can be captured by considering the genome as a state-dependent dynamic network where evolving chromosomal geometry determines network structure. It is supposed that a moment exists during terminal cell differentiation, perhaps coincident with cell cycle withdrawal, where the architecture and transcriptional networks undergo a unique change in their mutual relationships, to thus configure a developmentally informed "alignment."

FIGURES 1(A1), 1(A2), and 1(B) illustrate the mechanics of cell specialization, namely the relationship between spatial (form) and transcriptional (function) networks over time during cell specialization. Cell specialization involves alignment of the networks, where the architecture and transcriptional networks become mutually related. Prior to the aligned state, at least two models are used to describe the relationship between architecture and transcriptional networks. The first proposes that overall coregulated gene content (function) emerges according to overall chromosome associations (form). See FIGURE 1(A1). In other words, the genome at the chromosomal level self-organizes to facilitate coordinated gene regulation during differentiation. Thus form precedes function. The second model proposes that the transcriptional network and coregulated genes precede and shape the architecture network, or that form follows function. See FIGURE 1(A2). Under either model, commitment to terminal differentiation is associated with network alignment, and after alignment, feedback between the two networks allows fine tuning to achieve the desired cell-specific network configurations. See FIGURE 1(B).

It is further supposed herein that reorganization of chromosomal architecture minimizes the total information content or entropy. Transcription factories are an example of this, as, in principle, they increase the degree of coordination of the transcription of genes. However, various embodiments of the present subject matter recognize some issues, such as whether transcription factories are stable or whether they spontaneously and transiently self-organize, and whether developmental state influences the status and dynamics of transcription factories. Here, an abstraction-based formalism is provided, based on dynamic networks and their controllability properties, to shed light on the multifaceted aspects of nuclear organization underlying gene expression. First, the notion of dynamic state-dependent networks and their realizations in genomic organization is expanded upon. Subsequently, the dynamics and control properties of such networks in the context of cell differentiation are examined.

Focusing now on state-dependent networks, FIGURE 2 illustrates various pictorial diagrams that depict the open and closed chromatin domains throughout the genome, which occupy different spatial compartments in the nucleus. FIGURE 2(A) is an image of an interphase nucleus labeled by spectral karyotyping (SKY). Each chromosome is labeled with a unique color to visualize its territory. Analysis of SKY data reveals spatial relationships between each pair of chromosomes. FIGURE 2(B) is a correlation map of a chromosome generated from Hi-C. A correlation matrix illustrates the correlation (ranging from -1 (black) to +1 (red)) between the intrachromosomal interaction profiles of every pair of 1-Mb segments along each chromosome. FIGURE 2(C1) is the transcription profile along the chromosome. FIGURE 2(C2) is the open and closed chromatin profile along the chromosome (based on DNase I hypersensitivity). The two dominant proximity patterns, grey and black, correlate strongly with open and closed chromatin. FIGURES 2(D1)-2(D3) show a state-dependent network representation of proximity patterns. The faces of each element are color-coded. See FIGURE 2(D1). The elements can exchange information when the same color-coded sides are facing each other. See FIGURE 2(D2). The interaction graph associated with the elements of FIGURE 2(D1) is illustrated. See FIGURE 2(D3).

Consider the nucleus as a dynamical system composed of many interacting elements, among them networks having variable interactions with each other, for example, the networks of coregulated genes and chromosomal adjacencies. The emergent property of complex interactions among these elements defines the specific characteristics of an individual cell. Thus, the nucleus is described as self-organized because all interacting elements lead to a defined state, or signature, of that cell type.

Networks within the nucleus could rewire in both space and time, if, for example, the mutual exchange of information between the coregulated gene network and the chromosomal interaction network changes. Viewing elements within the nucleus as networks allows assignment of quantifiable values such as intranuclear positions of chromosomes, and comparison of these values over time may then provide a framework for studying the process of differentiation as well as how nuclear organization generally affects the properties of a cell. Recently, a method called Hi-C was developed that probes the three-dimensional architecture of whole genomes by coupling cross-linking of chromosomal sites with massively parallel sequencing. Using Hi-C, spatial proximity maps of the human genome were constructed at a resolution of 0.1-1.0 megabase pairs in two human hematopoietic cell lines representing distinct hematopoietic lineages, one derived from B-lymphocytes and the other an erythroleukemia-derived line. These maps confirmed the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. See FIGURE 2(A). The maps also identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. See FIGURE 2(B). Although compartment patterns in these two cell types were similar, many loci were discordant. Moreover, a strong correlation of the compartment pattern with transcription and chromatin accessibility was observed. See FIGURE 2(C1) and (C2). These results demonstrate that open (euchromatin) and closed (heterochromatin) chromatin domains throughout the genome occupy different spatial compartments in the nucleus and that these patterns distinguish specific cell types or states.

To shed light on and formalize how chromosomal organization in the nucleus can lead to functionally and architecturally distinct networks, the notion of state-dependent networks is of interest. One starts with a transparent example that abstracts the underlying phenomena: "cubes" are used instead of genes and chromosomes for this purpose. Here the ramifications of adopting such a point of view in the context of dynamics and control of genomic organization are highlighted. Consider a set of cubes with color-coded faces that can rotate about their respective geometric centers. It is assumed that each color represents one type of modality for interaction with other elements in this system of cubes. Moreover, assume that each pair can interact if the correct color sides are facing each other. As an example, when the elements are color-coded (as in FIGURE 2(D1)), it can be required that they can interact only when the white or the black sides are facing each other. Hence for the arrangement in FIGURE 2(D2), the interaction graph of FIGURE 2(D3) is obtained. Evidently, as the rotational states of these cubical elements evolve over time, a sequence of interaction graphs is obtained; in particular, the corresponding interaction graph is state-dependent and, in general, dynamic. Another example of a state-dependent graph, of particular relevance to a biological setting, is the distance-induced interaction model. In this framework, the interaction or coregulation among various parts of the genome is a function of their respective relative positions (see FIGURE 2(B)). Naturally, as the configuration of these elements evolves in time, the underlying interaction network evolves in time as well, resulting in a dynamic proximity graph.

More generally, a state-dependent graph is a mapping, $g_S$, from the system configuration space $Q$ to the set of all labeled graphs on n vertices $G(n)$; that is,

$$g_S : Q \to G(n), \qquad (1)$$

where $S$ specifies the edge-state-dependency set. We will subsequently denote $g_S(q)$ by $G_q$ to highlight the dependency of the resulting graph on the state $q \in Q$. It is assumed that the order of these graphs, n, is fixed. Their edge set, $E(g_S(q))$, however, is a function of the state $q$. Now one needs to specify further how the state of the system dictates the existence of an edge between a pair of vertices in the state-dependent graph. This is achieved by considering the subset $S_{ij} \subseteq Q_i \times Q_j$, where $Q_i$ and $Q_j$ are the state spaces of nodes i and j, respectively, and requiring that $\{i,j\} \in E(g_S(q))$ if and only if $(q_i, q_j) \in S_{ij}$; $S_{ij}$ is called the "edge states" of vertices i and j. If it is also assumed that the edge sets are such that $(q_i, q_j) \in S_{ij}$ if and only if $(q_j, q_i) \in S_{ji}$ for all i, j, then the resulting interaction graph is undirected; in general, state dependency leads to a directed graph. The collection of edge states for all pairs of nodes in the network constitutes the edge-state-dependency set $S$ in Eq. 1.
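The mapping from a configuration to an edge set can be sketched in a few lines of Python. This is only an illustrative sketch: the function `state_dependent_graph`, the predicate `same_interacting_color`, and the sample configurations are hypothetical names and data, not part of the described method. As a simplifying assumption, each node's state is reduced to the single color it currently shows to every other node, which is coarser than the per-face cube example.

```python
from itertools import combinations

def state_dependent_graph(q, in_edge_states):
    """Return the edge set E(g_S(q)) for the configuration q.

    q              -- sequence of node states (q_1, ..., q_n)
    in_edge_states -- predicate (i, j, q_i, q_j) -> bool, True when the
                      pair of states lies in the edge-state set S_ij
    """
    n = len(q)
    return {(i, j) for i, j in combinations(range(n), 2)
            if in_edge_states(i, j, q[i], q[j])}

# Hypothetical edge states: two nodes interact only when both show white
# or both show black (a symmetric rule, so the graph is undirected).
def same_interacting_color(i, j, qi, qj):
    return qi == qj and qi in {"white", "black"}

config_a = ["white", "white", "black", "grey"]
config_b = ["white", "black", "black", "grey"]   # node 1 has rotated

edges_a = state_dependent_graph(config_a, same_interacting_color)
edges_b = state_dependent_graph(config_b, same_interacting_color)
```

Changing the state of a single node (node 1 above) rewires the graph while the number of vertices stays fixed, which is the sense in which the edge set is a function of the state.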

In the context of weighted graphs, an analogous point of view can be adopted in regard to the state dependency of the edge weights. For example, the weight on the interaction link between nodes i and j can be viewed as being inversely proportional to their distance, translating to a stronger interaction between the nodes that are geometrically closer.
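A minimal NumPy sketch of such distance-dependent weights follows. It assumes that node states are 3-D positions and that the weight is exactly the reciprocal of the Euclidean distance; the function names and the sample coordinates are illustrative only.

```python
import numpy as np

def proximity_weights(positions):
    """Weight matrix with w_ij = 1 / ||p_i - p_j|| for i != j, zero diagonal.

    Assumes all positions are distinct (coincident nodes would give an
    infinite weight).
    """
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    weights = np.zeros_like(dist)
    off_diagonal = ~np.eye(len(positions), dtype=bool)
    weights[off_diagonal] = 1.0 / dist[off_diagonal]
    return weights

def weighted_laplacian(weights):
    """Weighted graph Laplacian: degree matrix minus weight matrix."""
    return np.diag(weights.sum(axis=1)) - weights

# Hypothetical positions: node 1 is close to node 0, node 2 is far away.
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
W = proximity_weights(positions)
L = weighted_laplacian(W)
```

Here the link between nodes 0 and 1 (distance 1) carries four times the weight of the link between nodes 0 and 2 (distance 4), and every row of the Laplacian sums to zero.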

Focusing on feedback and control of state-dependent networks, it was previously found that during differentiation of a hematopoietic progenitor into the erythroid or neutrophil lineages, the degree of intranuclear order changes, as captured by computing the total entropy in the system. See FIGURE 3, which illustrates the dynamics of order during cell specialization. When a progenitor commits to either the erythroid (black) or the neutrophil lineage (blue), there is a concomitant increase in order, eventually stabilizing at a level greater than that of the original multipotent progenitor.

As the progenitor commits to either lineage, order decreases (that is, entropy increases) to a minimum, which is defined herein as the metastable state; order then increases and eventually stabilizes at a level greater than that of the original progenitor as terminal differentiation is achieved. See FIGURE 3. The metastable state is a necessary phase transition for advancement to a more highly ordered state that characterizes more advanced (committed) stages of cellular differentiation. This may be a general property of differentiation in other lineages as well, where cell-specific organization of the nucleus emerges from mutual interactions between gene coregulation and chromosomal architecture. See FIGURE 1(B).

As the nucleus "reorganizes," loci containing up-regulated genes move from a repressive to an active nuclear compartment, whereas loci containing down-regulated genes move in the converse manner. Global reorganization of chromosome proximities also occurs during differentiation. However, it is unclear whether local changes in positioning, for example, looping of loci from chromosome territories, drive global reorganization on the whole chromosome level, or vice versa. The two models discussed previously are compatible with this scheme. Specific acquired mutations could interfere with this process and direct cells down an alternative pathway, particularly if an additional perturbation were to occur in the metastable state. Bypassing network alignment may lead to an alternative quasistable, or less random, ordered state that might in some cases give rise to further mutations and emergence of almost any conceivable abnormal phenotype. It is supposed herein that the intranuclear systems that are transitioning from a less-ordered state to a highly ordered state are losing controllability, or the ability to respond to specific external cues. A highly ordered system is generally less controllable than a disordered one. The notion that the metastable state is the most controllable one suggests the possibility that it is likely to be more responsive to therapeutic interventions.

In the context of state-dependent networks, consider the relative translational configuration of chromosome i with respect to chromosome j at time instance k, denoted by $e_{ij}(k)$. The configuration of the chromosomes induces a geometric network (either a weighted network or, with the help of a threshold, a graph); this network will be referred to as the cell proximity network $G_q(k)$, subsequently denoted by $Q(k)$. It is convenient to view this network as the structural organization of the cell. In view of the earlier discussion, it is noted that the cell proximity network is a state-dependent graph that is induced by the relative configurations of various chromosomes in the cell. Let $E(k)$ denote, on the other hand, the gene regulatory network in the cell, which can be viewed as the functional organization of the cell. Moreover, these two cell networks $Q(k)$ and $E(k)$ are not only related, but are also highly correlated via feedback. See FIGURE 4, which illustrates feedback between cell function and cell proximity network. The initial conditions $Q(0)$ and $E(0)$ are determined following an intricate alignment between form and function.

Focusing now on the introduction of control using MyoD and GATA-1, adopting the point of view of state-dependent networks, in conjunction with the premise that form and function of the cell are directly correlated, allows one to consider control mechanisms of the chromosomal network that drive the network geometry from an initial configuration toward a specific cell type. This is particularly exemplified by transcription factors that have broad influence on cell fate, such as MyoD. In addition to its role in myoblasts as a transcription factor regulating expression of skeletal muscle genes, MyoD can convert fibroblasts to skeletal muscle cells by activating the skeletal muscle differentiation program. Recent studies indicate that MyoD binds and induces histone modifications at tens of thousands of sites in the myoblast prior to transcription of most skeletal muscle genes, suggesting a potentially more global role in cell specification.

Another "master regulator" is GATA-1, a zinc finger transcription factor essential to maintenance of the erythroid and megakaryocyte lineages. GATA-1 may have a global impact on nuclear organization by catalyzing interactions within and among the coregulated gene and chromosome topology networks. Others have used ChIP-Seq methods to identify the spatial distribution of cis-regulatory elements targeted by GATA-1, and they determined criteria for distinguishing between target sites that promote activation versus repression of genes during erythroid development. The broad influence of these transcription factors provides a platform for understanding control over development of specific cell types. The proposed framework may also help identify previously uncharacterized master regulators that globally influence cell fate.

Discussing now the embedding of dynamics and control in state-dependent networks, making the network architecture dependent on the node dynamics allows for more explicit reasoning about the dynamical properties of the network geometry itself and, via the feedback, its function. See FIGURE 4. In the special case where the interaction between a pair of nodes i and j is dictated by their geometric states $q_i$ and $q_j$, the dynamics at the level of the state induces a corresponding dynamics at the network level, which in turn can further influence the evolution of the geometric states of the nodes. Representing the combined dynamics in continuous time assumes the form

$$\dot{q}(t) = f(q(t), Q(t)) \quad \text{and} \quad \dot{Q}(t) = g(q(t), Q(t)), \qquad (2)$$

where $q(t) = [q_1(t)^T, \ldots, q_n(t)^T]^T$ and $Q(t)$ represent, respectively, the geometric state of the genome and the cell proximity-induced interaction network, at time instance t. The exact form of the functions $f$ and $g$ in Eq. 2 leads to distinct dynamical features for the evolution of the geometric state and interaction network, particularly in relation to time scales, decomposability, and state and network equilibria. Such state-dependent network models lead to a unique set of challenges to the field of genomics as well as systems and control research. In order to illustrate some of these challenges via a simple example, and highlight the utility of a control theoretic perspective on genomic organization, consider a state-dependent proximity network. Assume that the nodes in the network have adopted a diffusion-like interaction scheme for synchronization of their translational dynamics, which possibly after a coordinate transformation, has resulted in the perturbed diffusion dynamics on the interaction network,

¾ ( =∑^ fe ( - ¾ ( - ¾- ( ) . where q^f) represents the translational coordinate of node i in the cell (with respect to some coordinate frame), which in turn evolves according to a local weighted gradient induced by the other nodes, including an offset induced by an ambient potential. The potentially time-varying offset also ensures that the network equilibrium assumes a definite geometry. This local interaction model leads to the collective dynamics of the form

where q(t) = cian

of the network, / denotes the 3x3 identity matrix, and "(g) " is the matrix Kronecker product. This model has been extensively studied in recent years due to its ramification for distributed estimation and control. In order to examine the state-dependent extension of the above diffusion model, let G denote the set of graphs of order n with vertex set

V = {1, 2, ..., n} and edge set E = {ij | i = 1, 2, ..., n-1; j = 2, ..., n; i < j}, with the weighting function $w: \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}_+$ assigning to each edge ij a function of the distance between the two nodes i and j. Thus $w_{ij} = w(q_i, q_j) = f(d_{ij})$, where $d_{ij} = \lVert q_i - q_j \rVert$ and $f(d_{ij}) = d_{ij}^{\alpha}$ for some $\alpha < 0$. In this case, the interaction between a pair of nodes weakens as they drift apart. This dependency of the edge weights on the relative distances between the nodes is in direct correspondence with the earlier discussion on the role of genomic organization. In the setup of the dynamics (Eq. 3), the dynamic network evolution has a distinct character depending on the exact form of the edge-weight dependency. For example, the stability of the state as well as of the network equilibria depends on the weighting functions, which specify how the edge weights depend on the state of the nodes. Moreover, the initial conditions for the geometric states are reflected in the resulting state equilibrium and, by association, in the network geometric and functional equilibria.
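The state-dependent weighting above can be sketched numerically. The following is a minimal illustration, not the disclosed implementation: it assumes the weights take the form $w_{ij} = d_{ij}^{\alpha}$ with $\alpha < 0$, that the collective dynamics is $\dot q = -(L_w \otimes I) q$, and that a simple forward-Euler step is adequate. The function names `weighted_laplacian` and `euler_step` are hypothetical.

```python
import numpy as np

def weighted_laplacian(q, alpha=-1.0):
    """Weighted Laplacian L_w for node positions q of shape (n, 3).

    Assumption: w_ij = d_ij**alpha with alpha < 0 and distinct node positions.
    """
    q = np.asarray(q, dtype=float)
    n = q.shape[0]
    # Pairwise Euclidean distances d_ij = ||q_i - q_j||
    d = np.linalg.norm(q[:, None, :] - q[None, :, :], axis=2)
    off_diag = ~np.eye(n, dtype=bool)
    w = np.zeros((n, n))
    w[off_diag] = d[off_diag] ** alpha  # interaction weakens with distance
    return np.diag(w.sum(axis=1)) - w

def euler_step(q, dt=1e-3, alpha=-1.0):
    """One forward-Euler step of q' = -(L_w(q) kron I_3) q."""
    q = np.asarray(q, dtype=float)
    # With q stored as an (n, 3) array, (L_w kron I_3) acts as L_w @ q
    return q - dt * weighted_laplacian(q, alpha) @ q

q0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
L0 = weighted_laplacian(q0)
q1 = euler_step(q0)
```

Because the Laplacian is recomputed from the current positions at every step, the network and the state evolve together, which is the state-dependent feature the text describes.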

Focusing now on the control of state-dependent networks, consider the "influenced version" of the diffusion dynamics over the network (Eq. 3), namely, $\dot q(t) = -\left( (L_w(G) \otimes I) + \mathrm{Diag}(B_w) \right) q(t) + B_w u(t)$, (4) where u denotes an external influence on the network, $\mathrm{Diag}(B_w)$ is a diagonal matrix with the vector $B_w$ on its diagonal, and the subscript in $B_w$ indicates that the input matrix might also be state-dependent. Within the context of this example, a few observations are pertinent: (i) whereas the state-independent diffusion dynamics over a connected network has a one-dimensional set of equilibria (namely, the translation of the span of the vector of all ones), the state-dependent dynamics might have multiple distinct equilibria as a function of the input u; (ii) the presence of an external influence can potentially drive the dynamics from one equilibrium to the next, leading to a new network formation. In order to assess which network configuration is more amenable to external influence, the notion of the controllability Gramian becomes particularly useful and provides a direct connection with the role of network entropy in the metastable state of the cell discussed earlier. See FIGURE 3.

The controllability Gramian for a network measures how controllable certain modes of the network are and which modal directions take less or more energy to steer. See FIGURE 5, which illustrates the controllability Gramian that characterizes the minimum energy input to the linear system $\dot x = Ax + Bu$ needed to steer the initial zero state at the infinite past to a point on the unit ball at t = 0 (top). Moreover, this Gramian defines how inputs on the unit sphere (bottom left of FIGURE 5), such as noise or external signals, map to system states: directions that are more controllable are characterized by elongated ellipsoidal axes, whereas the shortened axes are less controllable directions (bottom right of FIGURE 5).

In the case where interaction between the nodes is inversely proportional to their distance, the closer the nodes come together, the more strongly they interact with each other, and it is conceivable that the network becomes less controllable by an external signal such as MyoD. See FIGURE 6, which illustrates the aftermath of introducing a control signal for a diffusion-like protocol on state-dependent weighted networks. The Euclidean distance matrices obtained from the experimental data are processed via a Euclidean embedding algorithm to obtain a realization for the network, which is then subjected to an injected signal for a duration of 1 s. The controllability of the graph with respect to this input is then measured with respect to the two dominant directions over an interval. The network assumes a configuration that has a higher level of controllability with respect to the initial configuration, in direct correspondence with the metastable configuration during cell differentiation. Depending on the initial state of the nodes and their relative states with respect to the injected signal, the network's controllability can assume distinct profiles: initial higher levels of controllability during the signal injection (upper left), as measured by the volume of the controllability ellipsoid in the two dominant directions, or initial lower levels of controllability followed by a higher level of controllability (shown in the subsequent panels). The non-smooth segment in each figure corresponds to the removal of the injected signal.
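The embedding step can be illustrated as follows. The text does not name the Euclidean embedding algorithm that is used, so this sketch substitutes classical multidimensional scaling (MDS) as an assumed stand-in; the function names and the synthetic test points are hypothetical and are not experimental data.

```python
import numpy as np

def classical_mds(D, dim=3):
    """Recover coordinates (up to rotation/translation) from a distance matrix D.

    Assumption: classical MDS stands in for the unspecified embedding algorithm.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    G = -0.5 * J @ (D ** 2) @ J              # Gram matrix of centered points
    vals, vecs = np.linalg.eigh(G)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dim]       # indices of the largest eigenvalues
    scale = np.sqrt(np.clip(vals[top], 0.0, None))
    return vecs[:, top] * scale

def pairwise_distances(q):
    """Euclidean distance matrix for points q of shape (n, dim)."""
    return np.linalg.norm(q[:, None, :] - q[None, :, :], axis=2)

# Synthetic example: six hypothetical node positions in 3-D
rng = np.random.default_rng(0)
q_true = rng.normal(size=(6, 3))
D = pairwise_distances(q_true)
q_hat = classical_mds(D, dim=3)
```

The recovered realization `q_hat` reproduces the input distances, which is all that the distance-dependent edge weights require.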

Using linearization of the state-dependent network, the uncontrollability of the network can also be directly related to its network structure, in particular, to its symmetry. In general, symmetry in the network with respect to the external input leads to uncontrollable networks and singular controllability Gramians. See FIGURE 7(A) and (B), which illustrate network views of cellular reprogramming. Network diagrams A and B represent, respectively, uncontrollable and controllable networks for the diffusion-like dynamics on graphs with one injected signal (black filled node). At the same time, the controllability Gramian directly relates to how noise injected in the network maps to the state of the network and thus can be directly related to the network covariance and entropy. The general observation is that a higher degree of controllability relates to the determinant of the network covariance matrix, which in turn translates to an interpretation of network controllability in terms of the network entropy.
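The link between input symmetry and uncontrollability can be checked on a small example. The sketch below assumes a scalar state per node, a three-node path graph, and an input matrix that is the indicator vector of the injected node, so that the linearized form of Eq. 4 reads $\dot x = -(L + \mathrm{Diag}(b))x + bu$. It uses the Kalman rank test rather than the Gramian; the helper names are hypothetical.

```python
import numpy as np

def controllability_matrix(A, B):
    """Kalman controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def influenced_system(L, v):
    """System and input matrices when node v receives the injected signal.

    Assumption: B is the indicator vector of node v.
    """
    n = L.shape[0]
    b = np.zeros((n, 1))
    b[v, 0] = 1.0
    A = -(L + np.diag(b[:, 0]))
    return A, b

# Laplacian of the path graph 0 - 1 - 2
L_path = np.array([[1.0, -1.0, 0.0],
                   [-1.0, 2.0, -1.0],
                   [0.0, -1.0, 1.0]])

# Input at the middle node: swapping the two end nodes leaves the system unchanged
rank_middle = np.linalg.matrix_rank(controllability_matrix(*influenced_system(L_path, 1)))
# Input at an end node: no such symmetry
rank_end = np.linalg.matrix_rank(controllability_matrix(*influenced_system(L_path, 0)))
```

With the signal injected at the middle node the rank is deficient (the two end nodes cannot be steered independently), whereas injection at an end node gives full rank.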

Focusing on control from a subset of nodes, consider the controllability of the network in the neighborhood of the linearization point and the corresponding controllability Gramian structure. In this context one can examine local controllability of the state of the network from a small subset of nodes in the graph. It turns out that the structure of the network, as viewed from that node, has a direct implication for the system properties of the network, including its controllability. The notion of symmetry has been refined to that of nodal domains, which refers to the partition of the network in terms of the signs of the entries of the eigenvectors of the combinatorial Laplacian.

Consider again the diffusion dynamics (Eq. 4) with augmented inputs, such as MyoD. Using this abstraction-based model, the "most influential" external interactions with the network can be characterized. That is, there is an interest in identifying strategically optimal locations for the external input to exert influence, which in turn depend on the number of external signals and the structure of the network. For example, in the case of one external agent, it might be advantageous to locate this agent close to some "central" location in the network, where the distance (the number of edges) that must be traversed in order to reach the farthest node is minimized. However, in the case of two external signals, it is conceivable that their optimal placement would be at two peripheral locations in the network. The externally influenced diffusion dynamics (Eq. 4), which accepts inputs from external signals, provides a desired abstraction-based setting for a system-theoretic characterization of the influential location(s) in a genomic network. An instrumental construct for this purpose is the controllability Gramian. Consider a linear time-invariant model, with system and input matrices A and B, respectively, and a state denoted by x. Define the controllability operator $\Psi_c : L_2(-\infty, 0] \to \mathbb{R}^n$ by

$\Psi_c u = \int_{-\infty}^{0} e^{-A\tau} B\, u(\tau)\, d\tau$,

which can be viewed as the response of the linear system with initial condition $x(-\infty) = 0$ to an input $u \in L_2(-\infty, 0]$; here $L_2(-\infty, 0]$ denotes the space of square integrable functions with $(-\infty, 0]$ as their domain. Given the initial condition x(0) with unit norm, it is now desired to find the smallest-norm control input $u \in L_2(-\infty, 0]$ that solves the functional equation

$\Psi_c u = x(0)$ (note that $\Psi_c$ is an operator on a function space and u is a function of time); that is, it is desired to find the minimum norm control that steers the zero state at $t = -\infty$ to the state x(0) on the unit circle at t = 0. See FIGURE 5.

Standard arguments in linear systems theory then lead to two observations. Assuming that the linear system is controllable, (i) the minimum norm control is parameterized by $u = \Psi_c^{*} X_c^{-1} x_0$ ($\Psi_c^{*}$ is the adjoint of $\Psi_c$), with squared norm $x_0^T X_c^{-1} x_0$, where $X_c$, the controllability Gramian, is the positive definite solution of the matrix equation

$A X_c + X_c A^T + B B^T = 0$,

and (ii) the states reachable with control inputs with norms bounded by one are characterized by the ellipsoid $\{ X_c^{1/2} x_c : \lVert x_c \rVert \le 1 \}$, whose axes are the eigenvectors of the controllability Gramian $X_c$. In fact, if $X_c$ has eigenvalues $\lambda_i$ and $\lambda_j$ such that $\lambda_i \gg \lambda_j$, then there is more "stretching" in the direction of the normalized eigenvector corresponding to $\lambda_i$ as compared with the normalized eigenvector corresponding to $\lambda_j$.

That is, the ith direction is deemed "more controllable" than direction j.
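The two observations above can be made concrete with a short numerical sketch. It solves the Lyapunov equation $A X_c + X_c A^T + B B^T = 0$ by Kronecker vectorization (chosen here only to keep the example dependency-free; a dedicated Lyapunov solver would normally be preferred) and evaluates the steering energy $x_0^T X_c^{-1} x_0$. The two-state system and the function names are assumptions made for illustration.

```python
import numpy as np

def controllability_gramian(A, B):
    """Solve A X + X A^T + B B^T = 0 for X, assuming A is stable."""
    n = A.shape[0]
    I = np.eye(n)
    # vec(A X + X A^T) = (I kron A + A kron I) vec(X)
    K = np.kron(I, A) + np.kron(A, I)
    X = np.linalg.solve(K, -(B @ B.T).reshape(-1)).reshape(n, n)
    return 0.5 * (X + X.T)  # remove round-off asymmetry

def minimum_energy(X_c, x0):
    """Squared norm x0^T X_c^{-1} x0 of the minimum-norm control reaching x0."""
    return float(x0 @ np.linalg.solve(X_c, x0))

# Hypothetical stable two-state system with a single input
A = np.diag([-1.0, -2.0])
B = np.array([[1.0], [1.0]])
X_c = controllability_gramian(A, B)

# Eigenvectors give the ellipsoid axes; larger eigenvalue = more controllable
eigvals, eigvecs = np.linalg.eigh(X_c)
energy_easy = minimum_energy(X_c, eigvecs[:, -1])  # most controllable direction
energy_hard = minimum_energy(X_c, eigvecs[:, 0])   # least controllable direction
```

Reaching a unit-norm state along the eigenvector with the largest eigenvalue costs the reciprocal of that eigenvalue, so elongated ellipsoid axes correspond to cheap directions.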

The upshot of the aforementioned discussion is the following: the selection of the location for injecting steering signals in the network partitions the underlying network in Eq. 4, resulting in system and input matrices that, in turn, determine the controllability Gramian $X_c$. The controllability Gramian, on the other hand, characterizes which directions are more controllable than others. An influential location in the network is thus the node in the graph that leads to a controllability Gramian with a spectrum that stays away from the origin in the desirable directions. More precisely, suppose that in Eq. 4, we denote by A(v) and B(v) the system and input matrices that are obtained when v is selected as the influenced node, and that $X_c(v)$ is the resulting controllability Gramian.

Then the optimization problem (5) quantifies the most influential node in the network when measured with respect to providing the most controllability in the ith direction. We note that the optimization problem (Eq. 5) can be extended to the case where more than one external signal (e.g., transcription factors) is presented to the network. Adopting this point of view for controlling cell organization provides a unifying perspective on cellular processes such as differentiation, as well as a framework for more systematic reasoning on the role and efficiency of input signals in steering cell organization to particular network equilibria. See FIGURE 8, which illustrates control from a subset of nodes. With specific transcription factors (TF) as input signals, primary human fibroblasts (S1) transition to the metastable state (S2), where bifurcation takes place and two possible paths emerge. One leads to the normally specialized state (S3), and the other to an abnormally specialized state (S5). Interventions applied at S2 can influence the path taken. TF targeting a particular subset of nodes can also drive a transition from state S3 to another normally specialized state S4. The red and green circles denote the targeted nodes in each state. Furthermore, the presently disclosed methods can provide a means of evaluating and refining cell reprogramming strategies that rely on ectopic expression of transcription factors, such as derivation of iPS cells from a wide variety of differentiated cell types.
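A search for the most influential node in the sense of the optimization problem (Eq. 5) might be sketched as follows. The exact objective is not legible in the text, so this example assumes that each candidate node v is scored by a chosen eigenvalue of $X_c(v)$ (by default the smallest, so that the whole spectrum stays away from the origin), with a scalar state per node and an indicator input vector. All names and the four-node path graph are hypothetical.

```python
import numpy as np

def controllability_gramian(A, B):
    """Solve A X + X A^T + B B^T = 0 for X, assuming A is stable."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(I, A) + np.kron(A, I)
    X = np.linalg.solve(K, -(B @ B.T).reshape(-1)).reshape(n, n)
    return 0.5 * (X + X.T)

def node_scores(L, i=0):
    """Score each node v by the i-th smallest eigenvalue of X_c(v).

    Assumption: A(v) = -(L + Diag(e_v)) and B(v) = e_v, the indicator of v.
    """
    n = L.shape[0]
    scores = np.empty(n)
    for v in range(n):
        b = np.zeros((n, 1))
        b[v, 0] = 1.0
        A = -(L + np.diag(b[:, 0]))
        scores[v] = np.linalg.eigvalsh(controllability_gramian(A, b))[i]
    return scores

# Laplacian of the path graph 0 - 1 - 2 - 3
L_path = np.array([[1.0, -1.0, 0.0, 0.0],
                   [-1.0, 2.0, -1.0, 0.0],
                   [0.0, -1.0, 2.0, -1.0],
                   [0.0, 0.0, -1.0, 1.0]])
scores = node_scores(L_path)
best_node = int(np.argmax(scores))
```

Extending the sketch to several simultaneous signals would replace the single indicator vector with a multi-column input matrix and search over subsets of nodes.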

In sum, the dynamics and control of state-dependent graphs provide the basis for the notion that nuclear reorganization occurs at the time of cellular specification and both precedes and facilitates the orchestrated activation of transcriptional networks associated with subsequent cell differentiation. The processes of differentiation and reprogramming are related, and therefore they may have similar, or at least mutually resonating, reception to control. Recent breakthroughs in reprogramming somatic cells back to an ES cell-like state using just four genes, as well as more direct reprogramming routes, for example, the overexpression of the master regulator MyoD to generate myogenic cells, or a combination of transcription factors to elicit neuronal differentiation, have all shown how cellular differentiation systems can be controllable at some early determinative step.

Both of the foregoing strategies of reprogramming rely on expression of transcription factors to induce global changes, and it is likely that these changes are reflected in nuclear organization. From a control point of view, cellular reprogramming changes nuclear organization, thereby creating an environment that propels a system into a desired state. It has been shown that control over systems can be acquired by altering nuclear organization. The methods provide a means to identify critical subsets of nodes within state-dependent networks that lead to a specific specialized state through cellular differentiation. These nodes are highly relevant to mechanisms of reprogramming, or acquisition of control. The modeling asserts that their defining characteristic will be that recreating their specific configuration in an alternative cell type will inevitably lead it to the specialized state of the original cell through the same or a highly similar differentiation pathway. Adopting such a state-dependent network point of view to genome organization would also provide a more systematic means of reasoning about therapeutic interventions for cell processes that have become "out of control" such as cancer.

Controllability of dynamic state-dependent networks provides an intriguing, previously undescribed vista into steering complex and multifaceted interactions in the genome toward biologically desirable configurations. Network controllability can be formalized in various distinct forms. For example, in structural controllability, the network is assumed to be controllable if arbitrary weights on certain permissible interaction links lead to a controllable configuration. As the dynamic and state-dependent nature of the chromosomal network requires reasoning on the "degree" of controllability in a setting where edges can appear or strengthen in a state-dependent dynamic manner, system-theoretic Gramians have been adopted to examine and formalize network response and controllability properties. It is worth examining the contribution of various notions of controllability for understanding distinct aspects of cell organization and differentiation, as well as the complementary means of viewing cell reorganization via local energy optimization principles and game-theoretic network formation.

The presently disclosed methodology also has applications for designing influence mechanisms, e.g., marketing strategies, on social networks. Moreover, in the case where a disease spreads over a network of interacting individuals and populations, the disclosed methodology can be used to design vaccination strategies to influence the progression of an infectious disease. In the context of engineering, the correspondence between the notions of network controllability and entropy has direct implications for designing robust robotic networks, for example, those that are less controllable (i.e., more secure) with respect to external perturbations.

Focusing more specifically on controllability, controllability refers to the ability to steer a dynamical system via its input between an arbitrary pair of states. In this direction, suppose that one is given a controlled system of the form $\dot x(t) = f(x(t), u(t), w(t))$, where x denotes the state of the system, whose evolution is dictated by the control input u and the disturbance/noise signal w. Given an arbitrary pair of states $x_0$ and $x_f$, the system is called controllable if there exists a control input u such that, for any ε > 0, there exists a t such that the trajectory starting from $x_0$ satisfies $\lVert x(t) - x_f \rVert < \varepsilon$. When the dynamic system is specialized to linear systems, such as those obtained from diffusion models on graphs, a system is obtained of the form $\dot x(t) = A x(t) + B_u u(t) + B_w w(t)$, where $B_u$ and $B_w$ characterize, respectively, how the input and the disturbance influence the dynamics.

When the matrix A is stable, the controllability Gramian P is defined as the solution of the Lyapunov equation

$A P + P A^T + B_u B_u^T = 0$.

The Gramian P is positive definite when the pair $(A, B_u)$ is controllable, indicating that for any pair of initial and final states, there is a control signal that steers the system from one state to the other.

Here are the steps of the method to introduce a control signal that steers the system. For the first step, for estimating entropy, it is assumed that one has drawn iid samples $\{x_1, x_2, \ldots, x_n\}$ from an underlying distribution, where $x_i \in \mathbb{R}^d$ is a d×1 vector, and the samples have mean $\bar{x}$ and scaled sample covariance $S = \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^T$.

For estimating relative entropy, it is assumed that one has drawn iid d-dimensional samples from both distributions, and we denote the samples from the first distribution as $\{x_{1,1}, x_{1,2}, \ldots, x_{1,n_1}\}$ and the samples from the second distribution as $\{x_{2,1}, x_{2,2}, \ldots, x_{2,n_2}\}$. The empirical means are denoted $\bar{x}_1$ and $\bar{x}_2$, and the scaled sample covariances are denoted $S_1 = \sum_{i=1}^{n_1} (x_{1,i} - \bar{x}_1)(x_{1,i} - \bar{x}_1)^T$ and $S_2 = \sum_{i=1}^{n_2} (x_{2,i} - \bar{x}_2)(x_{2,i} - \bar{x}_2)^T$.

For some approaches, Σ has been treated herein as a random covariance matrix and μ as a random mean vector, with respective realizations marked $\tilde{\Sigma}$ and $\tilde{\mu}$. Expectations are always taken with respect to the posterior distribution unless otherwise noted. The digamma function is denoted $\psi(z) = \frac{d}{dz} \ln \Gamma(z)$, where Γ is the standard gamma function, and $\Gamma_d$ denotes the standard multi-dimensional gamma function. For the multivariate Gaussian distribution, the entropy goes as the log determinant of the covariance; specifically, the differential entropy of a d-dimensional random vector X drawn from the Gaussian $N(x) = N(x; \mu, \Sigma)$ is

$h(N) = -\int N(x) \ln N(x)\, dx = \frac{d}{2} + \frac{d \ln(2\pi)}{2} + \frac{\ln |\Sigma|}{2}$.
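The closed-form Gaussian entropy and the scaled sample covariance can be computed directly. The sketch below is illustrative only: it evaluates the formula above for a given covariance matrix and is not the unbiased estimator referred to in the text. The function names are hypothetical.

```python
import numpy as np

def scaled_sample_covariance(x):
    """Scaled sample covariance S = sum_i (x_i - mean)(x_i - mean)^T.

    x has shape (n, d): one d-dimensional sample per row.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    return centered.T @ centered

def gaussian_entropy(cov):
    """Differential entropy d/2 + d*ln(2*pi)/2 + ln|cov|/2 of N(mu, cov)."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)  # stable log-determinant
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * d + 0.5 * d * np.log(2.0 * np.pi) + 0.5 * logdet

h_unit = gaussian_entropy(np.eye(2))
S_example = scaled_sample_covariance([[0.0, 0.0], [2.0, 0.0]])
```

Using the log-determinant routine rather than taking the logarithm of a determinant avoids overflow or underflow when the dimension d is large.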

For the one-dimensional uniform distribution over support [b, a], with $a, b \in \mathbb{R}$ and a > b, the differential entropy is ln(a - b). The uniformly minimum variance unbiased entropy estimate for the Gaussian is,

In this task, motivated by the analogous situation in cell differentiation, consider the role of external agents in influencing the asymptotic properties of state-dependent graphs. In this direction, it is proposed to examine the following: how effective can external agents, attached to a subset of the state-dependent network, be in steering the network toward a particular equilibrium geometry? First consider the controllability of the state-dependent network, referred to as the G-process, in the case where the agents' dynamics is independent of the graph structure and, moreover, each agent is controllable (the agents' dynamics is referred to as the x-process). In such a setup, the notion of "regularity" is employed, which allows one to infer the controllability of the G-process, that is, the ability to control the trajectory of a dynamic state-dependent network, directly in terms of the controllability of the x-process. The controllability Gramian for a network measures how controllable certain modes of the network are and which modal directions take less or more energy to steer. In the case where interaction between the nodes is inversely proportional to their distance, the closer the nodes come together, the more strongly they interact with each other, and it is conceivable that the network becomes less controllable by an external signal such as MyoD.

For the second step, assume that the observations X: p×1 and Y: p×1 in state 1 and state 2 have multivariate normal distributions with covariance matrices Σ and Ω, respectively. Based on independent samples from state 1 and state 2, two independent p-dimensional sample covariance matrices S and T are obtained such that $mS \sim W_p(m, \Sigma)$ and $nT \sim W_p(n, \Omega)$, the Wishart distributions with m and n degrees of freedom, respectively. Consider the classical problem of testing

$H_0: \Sigma = \Omega$ vs. $K: \Sigma \neq \Omega$ (8) based on S and T.

One approach is to compare Σ to Ω by considering all possible covariance differences

$\mathrm{Cov}(a'X, b'X) - \mathrm{Cov}(a'Y, b'Y) \equiv a'\Sigma b - a'\Omega b$, as estimated by $a'Sb - a'Tb$.

Here a and b are p-dimensional column vectors with a, b ≠ 0, so a'X represents a general nontrivial linear combination of the components of X, etc. It has been shown that under $H_0: \Sigma = \Omega$ [1],

$\mathrm{Var}[a'Sb - a'Tb] = \left( \frac{1}{m} + \frac{1}{n} \right) \left[ (a'\Sigma a)(b'\Sigma b) + (a'\Sigma b)^2 \right]$,

hence the following alternative statistic (slightly modified here) has been proposed:

Others evaluate D(S,T) and the maximizing vectors (a,b) explicitly as follows:

(w- 1/2 a,W ~1/2 b) f p ,

Here $\gamma_1, \ldots, \gamma_p$ are the eigenvectors of $W^{-1/2}(S - T)W^{-1/2}$ and $f_1 \ge \cdots \ge f_p$ are the corresponding eigenvalues. An asymptotic approximation to the distribution of D(S,T) under $H_0$ has also been derived.

From eq.7 and eq.8,

$0 \le D(S,T) \le 1$,

and $D(S,T) = 0 \Leftrightarrow S = T$,

The distance D can be extended to the case where W is singular as follows: where a and b are now r × 1. Thus,

£>(S,r) = , (12)

(w- 1/2 a,W "1/2 b) / r ,

Here $\gamma_1, \ldots, \gamma_r$ are the eigenvectors of $W^{-1/2}(S - T)W^{-1/2}$ (14) and $f_1 \ge \cdots \ge f_r$ are the corresponding eigenvalues. (Equivalently, $f_1 \ge \cdots \ge f_r$ are the proper eigenvalues of $(S - T)W^{+}$.)

STEP 3:

The matrix-theoretic notation used here is as follows: for a matrix A, R(A) and N(A) denote, respectively, its range space and null space. Diagonal matrices will be written as $D = \mathrm{diag}\{d_1, \ldots, d_n\}$, with $d_i$ denoting the i-th entry on the diagonal. A matrix and/or a vector that consists of all zero entries will be denoted by 0; whereas, '0' will simply denote the scalar zero. Similarly, the vector 1 denotes the vector of all ones, and $J = 11^T$. The notation g(n) = O(f(n)) signifies that the function g(n) is bounded from above by some constant multiple of f(n) for large enough values of n. The set of real numbers will be denoted as $\mathbb{R}$, and $\lVert \cdot \rVert_p$ denotes the p-norm of its argument (e.g., p = 2, ∞), which will be used for vectors, matrices, and system norms. For the set S, |S| denotes its cardinality. An undirected (simple) graph G is specified by a vertex set $\mathcal{V}$ and an edge set $\mathcal{E}$ whose elements characterize the incidence relation between distinct pairs of $\mathcal{V}$. The notation i ~ j is used to denote that node i is connected to node j, or equivalently, $e = (i, j) \in \mathcal{E}$. Extensive use is made of the $|\mathcal{V}| \times |\mathcal{E}|$ incidence matrix, E(G), for a graph with an arbitrary orientation, i.e., a graph whose edges have a head (a terminal node) and a tail (an initial node). The columns of E(G) are then indexed by the edge set, and the i-th row entry takes the value '1' if it is the initial node of the corresponding edge, '-1' if it is the terminal node, and zero otherwise. From the definition of the incidence matrix it follows that the null space of its transpose, $N(E(G)^T)$, contains the agreement subspace span{1}. More generally, span{A} for the matrix A will be used to denote the subspace generated by linear combinations of its columns. The rank of the incidence matrix depends only on $|\mathcal{V}|$ and the number of its connected components. The diagonal matrix Δ(G) of the graph contains the degree of each vertex on its diagonal. The adjacency matrix, A(G), is the symmetric $|\mathcal{V}| \times |\mathcal{V}|$ matrix with zero on the diagonal and one in the ij-th position if node i is adjacent to node j. The (graph) Laplacian of G,

$L(G) := E(G) E(G)^T = \Delta(G) - A(G)$, (15)

is a rank-deficient positive semi-definite matrix. The eigenvalues are real and will be ordered and denoted as $0 = \lambda_1(G) \le \lambda_2(G) \le \cdots \le \lambda_{|\mathcal{V}|}(G)$.

The integral quadratic cost of control, that is, how much energy is required to steer one state to another, is related to the inverse of the Gramian (when the system is controllable); as such, a better conditioned Gramian leads to overall more efficient steering by the control input. Higher values of the determinant indicate the absence of small eigenvalues and, as such, generally lead to a better conditioned Gramian. On the other hand, when w is zero-mean Gaussian with unit covariance, and the matrix A is stable, the steady state covariance Σ of the state is determined by solving the Lyapunov equation $A \Sigma + \Sigma A^T + B_w B_w^T = 0$. Thus in the case where $B_w = B_u$, that is, when the ambient disturbance/noise enters the dynamics analogously to how the control signal would influence the dynamics, the steady state covariance Σ and the controllability Gramian P are identical. As such, the determinant of the steady state covariance can be used as an indicator of how controllable the network is. Consider the controllability of the network in the neighborhood of the linearization point and the corresponding controllability Gramian structure. In this context one can examine local controllability of the state of the network from a small subset of nodes in the graph. It turns out that the structure of the network, as viewed from that node, has a direct implication for the system properties of the network, including its controllability. As was mentioned, the notion of symmetry has been refined to that of nodal domains, which refers to the partition of the network in terms of the signs of the entries of the eigenvectors of the combinatorial Laplacian.
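The incidence-matrix construction of the graph Laplacian (Eq. 15) can be verified on a small graph. The sketch below assumes a four-node path graph with an arbitrary orientation; the function name `incidence_matrix` and the edge list are hypothetical.

```python
import numpy as np

def incidence_matrix(n, edges):
    """Incidence matrix with +1 at the tail (initial node) and -1 at the head."""
    E = np.zeros((n, len(edges)))
    for k, (tail, head) in enumerate(edges):
        E[tail, k] = 1.0
        E[head, k] = -1.0
    return E

# Path graph 0 - 1 - 2 - 3 with an arbitrary orientation
edges = [(0, 1), (1, 2), (2, 3)]
E = incidence_matrix(4, edges)

L = E @ E.T                           # graph Laplacian, Eq. 15
degree = np.diag(np.diag(L))          # degree matrix Delta(G)
adjacency = degree - L                # adjacency matrix A(G)
eigenvalues = np.linalg.eigvalsh(L)   # real, in ascending order
```

Reversing the orientation of any edge flips the sign of one column of E but leaves the product of E with its transpose, and hence the Laplacian, unchanged.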

While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.