


Title:
IP-BASED INTERCONNECTION OF SWITCHES WITH A LOGICAL CHASSIS
Document Type and Number:
WIPO Patent Application WO/2017/048432
Kind Code:
A1
Abstract:
One embodiment of the present invention provides a switch. The switch includes a logical chassis apparatus and a tunnel apparatus. The logical chassis apparatus associates a logical chassis identifier of a logical chassis with the switch and assigns an Internet Protocol (IP) address as switch identifier of the switch. The logical chassis includes a plurality of member switches and the switch is a member switch of the logical chassis. The IP address uniquely identifies the switch in the logical chassis. The tunnel apparatus establishes a tunnel with a remote switch in the logical chassis. An inter-switch packet from the switch is encapsulated in a tunnel header associated with the tunnel.

Inventors:
KOGANTI PHANIDHAR (US)
VOBBILISETTY SURESH (US)
Application Number:
PCT/US2016/047075
Publication Date:
March 23, 2017
Filing Date:
August 15, 2016
Assignee:
BROCADE COMM SYSTEMS INC (US)
International Classes:
H04L12/28; H04L12/931
Foreign References:
EP2874359A1 2015-05-20
US20130223221A1 2013-08-29
Attorney, Agent or Firm:
YAO, Shun (US)
Claims:
What Is Claimed Is:

1. A switch, comprising:

a logical chassis apparatus configured to:

associate a logical chassis identifier of a logical chassis with the switch, wherein the logical chassis includes a plurality of member switches, and wherein the switch is a member switch of the logical chassis; and

assign an Internet Protocol (IP) address as switch identifier of the switch, wherein the IP address uniquely identifies the switch in the fabric switch;

and

a tunnel apparatus configured to establish a tunnel with a remote switch in the logical chassis, wherein an inter-switch packet from the switch is encapsulated in a tunnel header associated with the tunnel.

2. The switch of claim 1, wherein the logical chassis apparatus is further configured to maintain a mapped identifier assigned to the switch, wherein the mapped identifier is an index for the switch in the logical chassis.

3. The switch of claim 2, wherein a port of the switch is identified by a port identifier uniquely identifying the port in the logical chassis, and wherein the port identifier includes the mapped identifier.

4. The switch of claim 1, wherein the logical chassis apparatus is further configured to determine adjacency in the logical chassis by running a routing protocol.

5. The switch of claim 1, wherein the logical chassis apparatus is configured to operate the logical chassis as a single manageable entity for provisioning, control, or both.

6. The switch of claim 5, wherein the logical chassis apparatus is further configured to manage the logical chassis based on one or more of:

a command line interface (CLI);

a Network Configuration Protocol (NETCONF); and

RESTCONF.

7. The switch of claim 1, wherein the switch is in a logical unit, wherein the logical unit is a building unit of a logical chassis.

8. The switch of claim 7, wherein the logical unit includes a second switch, and wherein the switch and the second switch operate as tunnel end points for the tunnel.

9. The switch of claim 1, further comprising a logical tunnel end point apparatus configured to operate the logical chassis as an end point for an external tunnel, wherein the other end point of the external tunnel is outside of the logical chassis.

10. The switch of claim 1, further comprising a link aggregation apparatus configured to:

identify a plurality of links coupled to a same neighbor switch; and

operate the identified links as a link aggregation group, wherein the links in the link aggregation group operate as a single logical link.

11. The switch of claim 1, wherein the switch is in a software defined network, and wherein the switch receives configuration information in an instruction from a controller of the software defined network.

12. The switch of claim 1, wherein the logical chassis apparatus is further configured to establish a point-to-point connection with a neighbor switch using an unnumbered interface based on the IP address.

13. The switch of claim 1, wherein the logical chassis apparatus is further configured to discover a neighbor switch based on a link discovery protocol.

AMENDED CLAIMS

received by the International Bureau on 04 January 2017 (04.01.2017)

What Is Claimed Is:

1. A switch, comprising:

a logical chassis module configured to:

associate a fabric identifier with the switch, wherein the fabric identifier identifies a network of interconnected switches, and wherein the switch is a member switch of the network of interconnected switches; and

assign an Internet Protocol (IP) address as a switch identifier of the switch, wherein the IP address uniquely identifies the switch in the network of interconnected switches; and

a tunnel module configured to establish a tunnel with a remote switch in the network of interconnected switches, wherein an inter-switch packet from the switch is encapsulated in a tunnel header associated with the tunnel.

2. The switch of claim 1, wherein the logical chassis module is further configured to maintain a mapped identifier, which is distinct from the switch identifier of the switch, associated with the switch, wherein the mapped identifier is an index for the switch in the network of interconnected switches.

3. The switch of claim 2, wherein a port of the switch is identified by a port identifier uniquely identifying the port in the network of interconnected switches, and wherein the port identifier includes the mapped identifier.

4. The switch of claim 1, wherein the logical chassis module is further configured to determine adjacency in the network of interconnected switches by running a routing protocol.

5. The switch of claim 1, wherein the logical chassis module is configured to operate the network of interconnected switches as a single manageable entity for provisioning, control, or both.

6. The switch of claim 5, wherein the logical chassis module is further configured to manage the network of interconnected switches based on one or more of:

a command line interface (CLI);

a Network Configuration Protocol (NETCONF); and

RESTCONF.

7. The switch of claim 1, wherein the switch is in a logical unit, wherein the logical unit is a building unit of the network of interconnected switches.

8. The switch of claim 7, wherein the logical unit includes a second switch, and wherein the switch and the second switch operate as tunnel end points for the tunnel.

9. The switch of claim 1, further comprising a logical tunnel end point module configured to operate the network of interconnected switches as an end point for an external tunnel, wherein the other end point of the external tunnel is outside of the network of interconnected switches.

10. The switch of claim 1, further comprising a link aggregation module configured to:

identify a plurality of links coupled to a same neighbor switch; and

operate the identified links as a link aggregation group, wherein the links in the link aggregation group operate as a single logical link.

11. The switch of claim 1, wherein the switch is in a software defined network, and wherein the switch receives configuration information in an instruction from a controller of the software defined network.

12. The switch of claim 1, wherein the logical chassis module is further configured to establish a point-to-point connection with a neighbor switch using an unnumbered interface based on the IP address.

13. The switch of claim 1, wherein the logical chassis module is further configured to discover a neighbor switch based on a link discovery protocol.

Description:
IP-BASED INTERCONNECTION OF SWITCHES WITH A LOGICAL CHASSIS

Inventors: Phanidhar Koganti and Suresh Vobbilisetty

BACKGROUND

Field

[0001] This disclosure relates to communication networks. More specifically, the present disclosure relates to a method for constructing a scalable switching system.

Related Art

[0002] The exponential growth of the Internet has made it a popular delivery medium for a variety of applications running on physical and virtual devices. Such applications have brought with them an increasing demand for bandwidth. As a result, equipment vendors race to build larger and faster switches with versatile capabilities, such as network virtualization and multi-tenancy, to accommodate diverse network demands efficiently. However, the size of a switch cannot grow infinitely. It is limited by physical space, power consumption, and design complexity, to name a few factors. Furthermore, switches with higher capability are usually more complex and expensive. More importantly, because an overly large and complex system often does not provide economy of scale, simply increasing the size and capability of a switch may prove economically unviable due to the increased per-port cost.

[0003] One way to increase the throughput of a switch system is to use switch stacking. In switch stacking, multiple smaller-scale, identical switches are interconnected in a special pattern to form a larger logical switch. However, switch stacking requires careful configuration of the ports and inter-switch links. The amount of required manual configuration becomes prohibitively complex and tedious when the stack reaches a certain size, which precludes switch stacking from being a practical option in building a large-scale switching system. Furthermore, a system based on stacked switches often has topology limitations which restrict the scalability of the system due to bandwidth considerations.

[0004] A flexible way to improve the scalability of a switch system is to build an interconnection of switches that share a single logical chassis (also referred to as "fabric switch"). A fabric switch is a collection of individual member switches. These member switches form a network of interconnected switches that can have an arbitrary number of ports and an arbitrary topology. As demands grow, customers can adopt a "pay as you grow" approach to scale up the capacity of the fabric switch.

[0005] While a fabric switch brings desirable features, some issues remain unsolved in efficient formation and data transportation of a scalable fabric switch.

SUMMARY

[0006] One embodiment of the present invention provides a switch. The switch includes a logical chassis apparatus and a tunnel apparatus. The logical chassis apparatus associates a logical chassis identifier of a logical chassis with the switch and assigns an Internet Protocol (IP) address as switch identifier of the switch. The logical chassis includes a plurality of member switches and the switch is a member switch of the logical chassis. The IP address uniquely identifies the switch in the logical chassis. The tunnel apparatus establishes a tunnel with a remote switch in the logical chassis. An inter-switch packet from the switch is encapsulated in a tunnel header associated with the tunnel.

[0007] In a variation on this embodiment, the logical chassis apparatus maintains a mapped identifier assigned to the switch. The mapped identifier is an index for the switch in the logical chassis.

[0008] In a further variation, a port of the switch is identified by a port identifier uniquely identifying the port in the logical chassis. This port identifier includes the mapped identifier.

[0009] In a variation on this embodiment, the logical chassis apparatus determines adjacency in the logical chassis by running a routing protocol.

[0010] In a variation on this embodiment, the logical chassis apparatus operates the logical chassis as a single manageable entity for provisioning, control, or both.

[0011] In a further variation, the logical chassis apparatus manages the logical chassis based on one or more of: a command line interface (CLI), a Network Configuration Protocol (NETCONF), and RESTCONF.

[0012] In a variation on this embodiment, the switch is in a logical unit, which is a building unit of a logical chassis.

[0013] In a further variation, the logical unit includes a second switch, and the switch and the second switch operate as tunnel end points for the tunnel.

[0014] In a variation on this embodiment, the switch also includes a logical tunnel end point apparatus, which operates the logical chassis as an end point for an external tunnel. The other end point of the external tunnel is outside of the logical chassis.

[0015] In a variation on this embodiment, the switch also includes a link aggregation apparatus, which identifies a plurality of links coupled to a same neighbor switch and operates the identified links as a link aggregation group. The links in the link aggregation group operate as a single logical link.

[0016] In a variation on this embodiment, the switch is in a software defined network and receives configuration information in an instruction from a controller of the software defined network.

[0017] In a variation on this embodiment, the logical chassis apparatus establishes a point-to-point connection with a neighbor switch using an unnumbered interface based on the IP address.

[0018] In a variation on this embodiment, the logical chassis apparatus discovers a neighbor switch based on a link discovery protocol.

BRIEF DESCRIPTION OF THE FIGURES

[0019] FIG. 1A illustrates an exemplary Internet-Protocol-based (IP-based) fabric switch, in accordance with an embodiment of the present invention.

[0020] FIG. 1B illustrates an exemplary console for configuring an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0021] FIG. 1C illustrates an exemplary configuration database for an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0022] FIG. 1D illustrates exemplary logical units in an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0023] FIG. 2A illustrates exemplary inter-switch tunnels in an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0024] FIG. 2B illustrates an exemplary tunnel encapsulation header for an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0025] FIG. 3A illustrates an exemplary IP-based fabric switch participating in a software defined network, in accordance with an embodiment of the present invention.

[0026] FIG. 3B illustrates exemplary Fibre Channel (FC) gateways in an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0027] FIG. 4 presents a flowchart illustrating the fabric-formation process of a member switch in an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0028] FIG. 5 illustrates an exemplary virtual link aggregation group in an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0029] FIG. 6 illustrates an exemplary member switch in an IP-based fabric switch, in accordance with an embodiment of the present invention.

[0030] In the figures, like reference numerals refer to the same figure elements.

DETAILED DESCRIPTION

[0031] The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.

Overview

[0032] In embodiments of the present invention, the problem of building a versatile, cost-effective, and scalable switching system is solved by forming a topology-agnostic fabric switch based on an internal (or underlay) layer-3 protocol. This internal layer-3 protocol operates within the fabric switch and may not advertise the routes within the fabric switch to devices outside of the fabric switch. A respective switch of the fabric switch can be referred to as a member switch. One can form a large-scale switch using a number of smaller physical switches. In some embodiments, this fabric switch can appear as a single logical entity in the provisioning and control plane. This allows a user to provide configuration information to a member switch, which, in turn, propagates the configuration information to other member switches. In this way, a respective member switch can locally apply the configuration information.

[0033] In some embodiments, the control plane running on a respective member switch allows any number of switches to be connected in an arbitrary topology without requiring tedious manual configuration of the ports and links. This feature makes it possible to use many smaller, inexpensive switches to construct a large network, which can operate as a single switch in the data plane as well. When a member switch of such a fabric switch learns a media access control (MAC) address of an end device (e.g., via layer-2 MAC address learning), the member switch generates a notification message, includes the learned MAC address in the payload of the notification message, and sends the notification message to all other member switches of the fabric switch. In this way, a learned MAC address is shared among the member switches of the fabric switch.
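The MAC-sharing behavior described in paragraph [0033] can be pictured with a minimal sketch. The class name `MemberSwitch`, the dictionary-based notification, and the in-memory delivery are all assumptions made for illustration; the disclosure does not specify a message format or transport.

```python
from dataclasses import dataclass, field


@dataclass
class MemberSwitch:
    """Hypothetical member switch; mac_table maps a MAC to the switch it sits behind."""
    name: str
    mac_table: dict = field(default_factory=dict)

    def learn_local(self, mac: str, members: list) -> None:
        # Layer-2 learning on an edge port: the MAC is locally attached.
        self.mac_table[mac] = self.name
        # Build a notification whose payload carries the learned MAC address.
        notification = {"origin": self.name, "learned_mac": mac}
        # Send the notification to all other member switches.
        for member in members:
            if member is not self:
                member.receive_notification(notification)

    def receive_notification(self, notification: dict) -> None:
        # A remote switch records which member switch the MAC was learned at.
        self.mac_table[notification["learned_mac"]] = notification["origin"]


switches = [MemberSwitch(f"switch-{n}") for n in (101, 102, 103, 104, 105)]
switches[3].learn_local("00:11:22:33:44:55", switches)  # learned at switch-104
```

After the call, every switch in the list maps the MAC address to `switch-104`, which is the sharing effect the paragraph describes.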

[0034] It should be noted that a fabric switch is not the same as conventional switch stacking. In switch stacking, multiple switches are interconnected at a common location (often within the same rack), based on a particular topology, and manually configured in a particular way. These stacked switches typically share a common address, e.g., an IP address, so they can be addressed as a single switch externally. Furthermore, switch stacking requires a significant amount of manual configuration of the ports and inter-switch links. The need for manual configuration prohibits switch stacking from being a viable option in building a large-scale switching system. The topology restriction imposed by switch stacking also limits the number of switches that can be stacked. This is because it is very difficult, if not impossible, to design a stack topology that allows the overall switch bandwidth to scale adequately with the number of switch units.

[0035] In contrast, a fabric switch can include an arbitrary number of switches with individual addresses, can be based on an arbitrary topology, and does not require extensive manual configuration. The switches can reside in the same location, or be distributed over different locations. These features overcome the inherent limitations of switch stacking and make it possible to build a large "switch farm," which can be treated as a single, logical switch. Due to the automatic configuration capabilities of the fabric switch, an individual physical switch can dynamically join or leave the fabric switch without disrupting services to the rest of the network.

[0036] Furthermore, the automatic and dynamic configurability of the fabric switch allows a network operator to build its switching system in a distributed and "pay-as-you-grow" fashion without sacrificing scalability. The fabric switch's ability to respond to changing network conditions makes it an ideal solution in a virtual computing environment, where network loads often change with time.

[0037] It should also be noted that a fabric switch is distinct from a VLAN. A fabric switch can accommodate a plurality of VLANs. A VLAN is typically identified by a VLAN tag. In contrast, the fabric switch is identified by a fabric identifier (e.g., a cluster identifier), which is assigned to the fabric switch. Since a fabric switch can be represented as a logical chassis, the fabric identifier can also be referred to as a logical chassis identifier. A respective member switch of the fabric switch is associated with the fabric identifier. In some embodiments, a fabric switch identifier is pre-assigned to a member switch. As a result, when the switch joins a fabric switch, other member switches identify the switch to be a member switch of the fabric switch.

[0038] In this disclosure, the term "fabric switch" refers to a number of interconnected physical switches which form a single, scalable network of switches. The member switches of the fabric switch can operate as individual switches. The member switches of the fabric switch can also operate as a single switch in the provision and control plane, the data plane, or both. "Fabric switch" should not be interpreted as limiting embodiments of the present invention to a plurality of switches operating as a single, logical switch.

[0039] Although the present disclosure is presented using examples based on an encapsulation protocol, embodiments of the present invention are not limited to networks defined using one particular encapsulation protocol associated with a particular Open System Interconnection Reference Model (OSI reference model) layer. For example, embodiments of the present invention can also be applied to a multi-protocol label switching (MPLS) network. In this disclosure, the term "encapsulation" is used in a generic sense, and can refer to encapsulation in any networking layer, sub-layer, or a combination of networking layers.

[0040] The term "end device" can refer to any device external to a network (e.g., does not perform forwarding in that network). Examples of an end device include, but are not limited to, a physical or virtual machine, a conventional layer-2 switch, a layer-3 router, or any other type of network device. Additionally, an end device can be coupled to other switches or hosts further away from a layer-2 or layer-3 network. An end device can also be an aggregation point for a number of network devices to enter the network. An end device hosting one or more virtual machines can be referred to as a host machine. In this disclosure, the terms "end device" and "host machine" are used interchangeably.

[0041] The term "VLAN" is used in a generic sense, and can refer to any virtualized network. Any virtualized network comprising a segment of physical networking devices, software network resources, and network functionality can be referred to as a "VLAN." "VLAN" should not be interpreted as limiting embodiments of the present invention to layer-2 networks. "VLAN" can be replaced by other terminologies referring to a virtualized network or network segment, such as "Virtual Private Network (VPN)," "Virtual Private LAN Service (VPLS)," or "Easy Virtual Network (EVN)."

[0042] The term "packet" refers to a group of bits that can be transported together across a network. "Packet" should not be interpreted as limiting embodiments of the present invention to layer-3 networks. "Packet" can be replaced by other terminologies referring to a group of bits, such as "frame," "cell," or "datagram."

[0043] The term "switch" is used in a generic sense, and can refer to any standalone or fabric switch operating in any network layer. "Switch" can be a physical device or software running on a computing device. "Switch" should not be interpreted as limiting embodiments of the present invention to layer-2 networks. Any device that can forward traffic to an external device or another switch can be referred to as a "switch." Examples of a "switch" include, but are not limited to, a layer-2 switch, a layer-3 router, a TRILL RBridge, or a fabric switch comprising a plurality of similar or heterogeneous smaller physical switches.

[0044] The term "edge port" refers to a port on a network which exchanges data frames with a device outside of the network (i.e., an edge port is not used for exchanging data frames with another member switch of a network). The term "inter-switch port" refers to a port which sends/receives data frames among member switches of the network. A link between inter-switch ports is referred to as an "inter-switch link." The terms "interface" and "port" are used interchangeably.

[0045] The term "switch identifier" refers to a group of bits that can be used to identify a switch. Examples of a switch identifier include, but are not limited to, a media access control (MAC) address, an Internet Protocol (IP) address, an RBridge identifier, or a combination thereof. In this disclosure, "switch identifier" is used as a generic term, is not limited to any bit format, and can refer to any format that can identify a switch.

[0046] The term "tunnel" refers to a data communication where one or more networking protocols are encapsulated using another networking protocol. Although the present disclosure is presented using examples based on a layer-3 encapsulation of a layer-2 protocol, "tunnel" should not be interpreted as limiting embodiments of the present invention to layer-2 and layer-3 protocols. A "tunnel" can be established for and using any networking layer, sub-layer, or a combination of networking layers.

Network Architecture

[0047] FIG. 1A illustrates an exemplary IP-based fabric switch, in accordance with an embodiment of the present invention. As illustrated in FIG. 1A, a fabric switch 100 includes member switches 101, 102, 103, 104, and 105. Fabric switch 100 can be based on IP and a respective member switch, such as switch 105, can be an IP-capable switch, which calculates and maintains a local IP routing table (e.g., a routing information base or RIB), and is capable of forwarding packets based on its IP addresses. The routing table specifies routes within fabric switch 100. To populate the IP routing table, a respective member switch uses a routing protocol (e.g., OSPF-based routing protocol). In some embodiments, one or more switches in fabric switch 100 can be virtual switches (e.g., a software switch running on a computing device). Switches 101 and 104 are coupled to end devices 112 and 114, respectively.

[0048] Member switches in fabric switch 100 use edge ports to communicate with end devices and inter-switch ports to communicate with other member switches. For example, switch 101 is coupled to end device 112 via an edge port and to switches 102, 103, 104, and 105 via inter-switch ports. Communication between member switches via inter-switch ports can be based on IP, and communication between an end device and a member switch via an edge port can be based on Ethernet. For example, switch 104 receives an Ethernet frame from end device 114 via an edge port. Switch 104 then encapsulates the Ethernet frame in an IP header (e.g., a layer-3 tunnel header) and forwards the encapsulated packet to another member switch. It should be noted that the encapsulated packet can have an external Ethernet header for layer-2 forwarding.
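As a rough illustration of the edge-to-fabric encapsulation in paragraph [0048], the sketch below wraps an Ethernet frame in an outer IP tunnel header keyed by switch identifiers. The type names, field names, and the 192.0.2.x documentation addresses are assumptions; the disclosure does not fix a header layout here, and the optional external Ethernet header is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EthernetFrame:
    src_mac: str
    dst_mac: str
    payload: bytes


@dataclass(frozen=True)
class TunnelPacket:
    outer_src_ip: str  # switch identifier of the encapsulating member switch
    outer_dst_ip: str  # switch identifier of the remote member switch
    inner: EthernetFrame


def encapsulate(frame: EthernetFrame, local_ip: str, remote_ip: str) -> TunnelPacket:
    # The frame received on an edge port is carried unchanged inside the tunnel.
    return TunnelPacket(outer_src_ip=local_ip, outer_dst_ip=remote_ip, inner=frame)


def decapsulate(packet: TunnelPacket) -> EthernetFrame:
    # The remote member switch strips the tunnel header to recover the frame.
    return packet.inner


frame = EthernetFrame("00:00:5e:00:53:14", "00:00:5e:00:53:12", b"hello")
packet = encapsulate(frame, local_ip="192.0.2.104", remote_ip="192.0.2.101")
```

Here `192.0.2.104` and `192.0.2.101` stand in for the switch identifiers of switches 104 and 101; decapsulating `packet` yields the original frame.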

[0049] A respective switch in fabric switch 100 is assigned a switch identifier, such as an IP address (e.g., an IPv4 or IPv6 address). A user (e.g., a network administrator) can assign the switch identifier to a respective member switch. For example, end device 112 can be an administrator workstation and the user can assign a switch identifier of a respective member switch from end device 112. A switch identifier can also be dynamically assigned to a switch (e.g., using a Dynamic Host Configuration Protocol (DHCP) server). In some embodiments, from a provisioning perspective (e.g., assigning an IP address), end device 112 views fabric switch 100 as a single logical entity, such as a logical chassis 110. A respective member switch can appear as an element in logical chassis 110. As a result, the user can configure fabric switch 100 from a single location (e.g., end device 112), and global configurations (i.e., the configurations applicable to a respective switch) can be automatically applied to a respective member switch. End device 112 can manage logical chassis 110 as a single manageable entity. In some embodiments, end device 112 can use a command line interface (CLI) of a switch or a management protocol to manage logical chassis 110. Examples of a management protocol include, but are not limited to, Network Configuration Protocol (NETCONF) and RESTCONF.

[0050] Furthermore, a respective member switch is assigned a mapped identifier, which can be a switch index within fabric switch 100. This mapped identifier can also be locally generated in a switch based on the local switch identifier. The mapped identifier can operate as a "shortened" identifier for a switch. In the example in FIG. 1A, fabric switch 100 includes five member switches. A switch identifier for a respective switch in fabric switch 100 can be an IP address, which is 32 bits long for IPv4 or 128 bits long for IPv6. However, a number represented by three bits (e.g., integers 0-4) can identify the member switches in fabric switch 100. Hence, a three-bit long mapped identifier can be used to represent the member switches of fabric switch 100. In some embodiments, the number of bits dedicated for a mapped identifier in fabric switch 100 is determined based on the maximum number of member switches supported by fabric switch 100. For example, if fabric switch 100 supports a maximum of 64 member switches, the mapped identifier for fabric switch 100 should be at least six bits long.
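The bit-width arithmetic in paragraph [0050] can be checked with a short sketch. The function name `mapped_id_bits` is hypothetical, and the sketch assumes mapped identifiers are the consecutive indices 0 through N-1, which matches the "integers 0-4" example but is not required by the disclosure.

```python
def mapped_id_bits(max_members: int) -> int:
    """Minimum number of bits needed to index max_members switches as 0..max_members-1."""
    if max_members < 1:
        raise ValueError("a fabric switch needs at least one member switch")
    # The largest index is max_members - 1; its bit length is the required width.
    # A single-member fabric still gets one bit so the identifier is not empty.
    return max(1, (max_members - 1).bit_length())


bits_for_five = mapped_id_bits(5)          # indices 0-4 fit in three bits
bits_for_sixty_four = mapped_id_bits(64)   # indices 0-63 fit in six bits
```

Both values agree with the paragraph: three bits for the five switches of FIG. 1A and six bits for a 64-switch maximum, compared with 32 or 128 bits for the IP address itself.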

[0051] In some embodiments, a port in fabric switch 100 is assigned a port identifier, which uniquely identifies the port in fabric switch 100. A port identifier in fabric switch 100 can be in a "mapped identifier/chassis number (e.g., line card number)/port number" format. If the mapped identifier of switch 101 is "X," and switch 101 has at least three chassis, one of which includes at least 16 ports, a port identifier of switch 101 can be "X/2/15." This identifier represents the sixteenth port of the third chassis of switch 101. Similarly, if the mapped identifier of switch 102 is "Y," and switch 102 also has at least three chassis, each of which includes at least 16 ports, a port identifier of switch 102 can be "Y/2/15." In this way, the mapped identifier in a port identifier distinguishes two ports having the same chassis and port number in fabric switch 100. If a switch is a "pizza box" switch with a single chassis, the chassis number in a port identifier can be "0."
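The "mapped identifier/chassis number/port number" format of paragraph [0051] lends itself to a small parsing sketch. The `PortId` type and `parse_port_id` helper are hypothetical names, and the zero-based numbering of chassis and ports is inferred from the "X/2/15" example (sixteenth port of the third chassis).

```python
from typing import NamedTuple


class PortId(NamedTuple):
    mapped_id: str  # shortened switch identifier, e.g. "X"
    chassis: int    # zero-based chassis (line card) number
    port: int       # zero-based port number on that chassis

    def __str__(self) -> str:
        return f"{self.mapped_id}/{self.chassis}/{self.port}"


def parse_port_id(text: str) -> PortId:
    # Split the three slash-separated fields and convert the numeric ones.
    mapped_id, chassis, port = text.split("/")
    return PortId(mapped_id, int(chassis), int(port))


port_on_101 = parse_port_id("X/2/15")
port_on_102 = parse_port_id("Y/2/15")
pizza_box_port = PortId("Z", 0, 7)  # single-chassis switch uses chassis number 0
```

The two parsed identifiers share the chassis and port numbers but differ in the mapped identifier, so they compare as distinct ports within the fabric switch.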

[0052] During operation, a respective member switch of fabric switch 100 uses a link discovery protocol via its inter-switch links to discover a neighbor switch. Examples of a link discovery protocol include, but are not limited to, Link Layer Discovery Protocol (LLDP) and Brocade Link Discovery Protocol (BLDP). In some embodiments, an inter-switch link can be modeled as a point-to-point unnumbered interface to avoid IP address and/or mask configuration for a respective inter-switch link. For example, when respective IP addresses are assigned to switches 103 and 104 as switch identifiers, the inter-switch communication between switches 103 and 104 can be established as a point-to-point communication channel between the corresponding interfaces using the IP addresses. This allows auto discovery of neighbors in fabric switch 100 without configuring an individual IP address for a respective interface.

[0053] In some embodiments, fabric switch 100 is assigned a fabric identifier, which uniquely identifies fabric switch 100. The fabric identifier is assigned to a respective switch of fabric switch 100 (e.g., the user can configure from end device 112). Upon discovering each other, switches 101 and 103 determine that they have the same fabric identifier of fabric switch 100 and belong to the same fabric switch. This allows a member switch of fabric switch 100 to automatically detect other member switches and form fabric switch 100.
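A minimal sketch of the membership check in paragraph [0053] follows. It assumes, for illustration only, that a discovered neighbor advertises its fabric identifier alongside its switch identifier; the `DiscoveryMessage` type, its fields, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryMessage:
    switch_id: str  # IP address used as the neighbor's switch identifier
    fabric_id: int  # fabric (logical chassis) identifier configured on the neighbor


def fabric_neighbors(local_fabric_id: int, messages: list) -> list:
    # Keep only neighbors whose fabric identifier matches the local one.
    return sorted({m.switch_id for m in messages if m.fabric_id == local_fabric_id})


# Hypothetical view from switch 101, which is configured with fabric identifier 100.
heard = [
    DiscoveryMessage("192.0.2.103", fabric_id=100),
    DiscoveryMessage("192.0.2.102", fabric_id=100),
    DiscoveryMessage("198.51.100.9", fabric_id=200),  # belongs to a different fabric
]
members = fabric_neighbors(100, heard)
```

Only the two neighbors configured with the same fabric identifier are retained as candidate member switches.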

[0054] In some embodiments, inter-switch links in fabric switch 100 support automatic formation of link aggregations. Suppose that three links couple switches 101 and 103. As a result, switch 101 discovers switch 103 via all three links, and vice versa. For example, switch 101 can receive a link discovery message (e.g., LLDP Data Unit (LLDPDU)) comprising the same switch identifier of switch 103 via the three links. Switch 101 then determines that switch 101 is coupled to switch 103 via those three links. Similarly, switch 103 also determines that switch 103 is coupled to switch 101 via the three links. Switches 101 and 103 then automatically aggregate the links between them to form an inter-switch link aggregation group 130.
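The automatic aggregation in paragraph [0054] amounts to grouping local ports by the switch identifier heard on each of them. The sketch below shows one way to do that; the function name, the port names, and the documentation addresses are assumptions for illustration.

```python
from collections import defaultdict


def form_link_aggregation_groups(discoveries: list) -> dict:
    """discoveries: (local_port, neighbor_switch_id) pairs from link discovery."""
    ports_by_neighbor = defaultdict(list)
    for local_port, neighbor_id in discoveries:
        ports_by_neighbor[neighbor_id].append(local_port)
    # Only a plurality of links to the same neighbor forms an aggregation group.
    return {
        neighbor: sorted(ports)
        for neighbor, ports in ports_by_neighbor.items()
        if len(ports) > 1
    }


# Hypothetical view from switch 101: three links to switch 103, one to switch 102.
seen_by_101 = [
    ("port-1", "192.0.2.103"),
    ("port-2", "192.0.2.103"),
    ("port-3", "192.0.2.102"),
    ("port-4", "192.0.2.103"),
]
groups = form_link_aggregation_groups(seen_by_101)
```

The three ports that heard the identifier of switch 103 end up in one group, while the single link to switch 102 stays an individual link.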

[0055] If a link in link aggregation group 130 becomes unavailable, other links can continue to operate. However, if the number of links becomes one, link aggregation group 130 can become an individual link. If link aggregation group 130 becomes unavailable (e.g., due to multiple link failures or a node failure), switches 101 and 103 detect the unavailability, and generate and send a notification message notifying other member switches regarding the unavailability (e.g., in the payload of the notification message). Similarly, if the link between switches 104 and 105 becomes unavailable, switches 104 and 105 detect the unavailability, and generate and send a notification message. Upon receiving the notification message, other member switches run the routing protocol based on the updated adjacency to determine the updated paths.

[0056] In link aggregation group 130, the links can be coupled to one or more network interface cards (NICs). For example, if a set of links coupled to a NIC in switch 101 are also coupled to a NIC in switch 103, the set of links form a link trunk 132. Link trunk 132 and individual non-trunk link 134 between switches 101 and 103 then form link aggregation group 130. It should be noted that a link aggregation group can include a combination of link trunks and individual links.

[0057] Upon forming the link aggregation groups for inter-switch links, a respective member switch in fabric switch 100 runs a routing protocol to discover adjacency in fabric switch 100. Examples of a routing protocol include, but are not limited to, Open Shortest Path First (OSPF) based routing protocols, distance vector based routing protocols, and a combination thereof. This routing protocol discovers one or more paths between the member switches. For example, switch 103 discovers that switch 105 is reachable via a path comprising switch 104, and can assign switch 104 as the next-hop switch for switch 105. If a link aggregation group exists between a member switch pair, the adjacency can be formed over that link aggregation group. For example, adjacency between switches 101 and 103 can be formed over link aggregation group 130.
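The next-hop assignment in this example can be illustrated with a small sketch. It swaps in a plain breadth-first search for a real routing protocol such as OSPF, and the adjacency map below is an assumed topology chosen to match the example (switch 103 reaches switch 105 through switch 104); the function name is hypothetical.

```python
from collections import deque


def next_hops(adjacency, source):
    """Return {destination: next-hop neighbor} along a fewest-hops path."""
    hops = {}
    visited = {source}
    queue = deque()
    # Direct neighbors are their own next hop.
    for nbr in adjacency[source]:
        if nbr not in visited:
            visited.add(nbr)
            hops[nbr] = nbr
            queue.append(nbr)
    # Every switch found later inherits the next hop of the switch it was found from.
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in visited:
                visited.add(nbr)
                hops[nbr] = hops[node]
                queue.append(nbr)
    return hops


# Assumed adjacency: 101-103, 103-104, 104-105.
adjacency = {101: [103], 103: [101, 104], 104: [103, 105], 105: [104]}
table = next_hops(adjacency, 103)
```

With this assumed topology, the table for switch 103 lists switch 104 as the next hop for switch 105.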

[0058] In some embodiments, a respective switch of fabric switch 100 supports priority-based flow control (PFC). In this way, during packet forwarding within fabric switch 100, a member switch can provide a uniform quality of service (QoS) for the outer and inner layer-2 headers. During operation, switch 104 receives an Ethernet frame from end device 114. Switch 104 identifies the priority value associated with PFC in the Ethernet header and encapsulates the frame in an IP header and an outer Ethernet header. Switch 104 maps the identified priority value to the outer Ethernet header (e.g., a one-to-one mapping) and forwards the encapsulated packet based on its destination. Upon receiving the packet, any other member switch applies priority-based flow control to the packet based on the priority value in the outer Ethernet header.

[0059] In some embodiments, a member switch can share information with another member switch in fabric switch 100 based on a name service. Upon learning a MAC address, a member switch includes the learned MAC address in a payload of a name service notification message and sends the message to a respective other switch of fabric switch 100 via the name service. In this way, a respective member switch is aware of the locations of the end devices coupled with fabric switch 100. In some embodiments, the payload format of the notification message is the same regardless of the protocol on which the name service is based. This allows the name service to be backward compatible. The name service can be implemented based on a scalable protocol, such as ZeroMQ and NanoMsg.

[0060] FIG. 1B illustrates an exemplary console for configuring an IP-based fabric switch, in accordance with an embodiment of the present invention. In this example, switch 101 has a console 150. Upon being accessed (e.g., from end device 112), switch 101 presents console 150 to the user. In some embodiments, when the user accesses switch 101, console 150 provides a command line interface shell 152 to the user. The user can type commands to shell 152. Shell 152 can be the initial screen which appears when the user accesses switch 101. Suppose that fabric switch 100 has a fabric identifier 142, which identifies fabric switch 100 and is associated with a respective switch of fabric switch 100. Since fabric switch 100 can operate as a single logical chassis for the provision and control plane, the user can provide a command to shell 152 to gain access to fabric switch 100 as a logical chassis (e.g., "fabric-switch fabric-id 142"). This allows the user to provision fabric switch 100 as logical chassis 110.

[0061] The user then provides another command (e.g., "config terminal") to shell 152 to gain access to a configuration terminal for fabric switch 100 in shell 152. The user can use this configuration terminal to provide global configuration associated with fabric switch 100 and local configuration associated with any member switch in fabric switch 100. For example, if switch 101 has a switch MAC address 144, the user can issue a command to the terminal to map MAC address 144 to a switch identifier 146 (e.g., a switch IP address). Similarly, the user can issue another command to the terminal to map switch identifier 146 to a mapped identifier 148. In some embodiments, switch 101 can generate mapped identifier 148 from switch identifier 146 (e.g., without a user configuration). Mapped identifier 148 can also be pre-assigned to switch 101.

[0062] The user can issue a command to the terminal to create a VLAN 110. This VLAN 110 is created across fabric switch 100, and hence, is part of the global configuration. On the other hand, the user can also issue a command to configure a specific port of a specific switch in fabric switch 100. This port configuration is a local configuration for that switch. In some embodiments, a port is identified by a port identifier, which can be in a "mapped identifier/chassis number (e.g., line card number)/port number" format. For example, the user can configure a 10 Gigabit Ethernet port identified by port identifier "148/2/15." Since switch 101 is associated with mapped identifier 148, the port identifier indicates that the port is port number 15 in line card number 2 of switch 101. The user can add VLAN 110 to that specific port. This VLAN configuration of the port is a local configuration of switch 101.
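As a minimal sketch of how the three-part port identifier format could be handled in software, the following Python snippet splits an identifier into its fields. The type name, field names, and function name are hypothetical and are not part of any described command syntax.

```python
from typing import NamedTuple


class PortId(NamedTuple):
    mapped_id: int   # mapped identifier of the member switch
    chassis: int     # chassis number (e.g., line card number)
    port: int        # port number


def parse_port_id(text):
    """Split a 'mapped identifier/chassis number/port number' string."""
    parts = text.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three '/'-separated fields, got {text!r}")
    mapped_id, chassis, port = (int(p) for p in parts)
    return PortId(mapped_id, chassis, port)


port_id = parse_port_id("148/2/15")
```

The first field selects the member switch, so a configuration tied to such an identifier is local to that switch even though it is entered through the fabric-wide terminal.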

[0063] FIG. 1C illustrates an exemplary configuration database for an IP-based fabric switch, in accordance with an embodiment of the present invention. As illustrated in FIG. 1C, a member switch of fabric switch 100 typically maintains two configuration tables that describe its instance: a fabric switch configuration database 180, and a default switch configuration table 184. Configuration database 180 describes the fabric switch configuration when a switch is part of fabric switch 100. Default switch configuration table 184 describes the switch's default configuration. Configuration database 180 includes a global configuration table (GT) 182, which includes a fabric switch identifier, such as fabric identifier 142 for fabric switch 100 (denoted as FABRIC_ID), and a VLAN list in fabric switch 100. Also included in configuration database 180 are a number of switch (or local) configuration tables (STs or LTs), such as ST0, ST1, and STn. Each ST includes the corresponding member switch's MAC address and the switch identifier, as well as the switch's interface details.

[0064] In some embodiments, when a switch joins fabric switch 100 for the first time, fabric switch 100 assigns a mapped identifier to the switch. For example, fabric switch 100 assigns a value of "0" to mapped identifier 148 of switch 101 and stores it in the corresponding ST0. This mapped identifier persists with switch 101, even if switch 101 leaves fabric switch 100. When switch 101 joins fabric switch 100 again at a later time, the same mapped identifier "0" is used by fabric switch 100 to retrieve previous configuration information for switch 101. This feature can reduce the amount of configuration overhead in fabric switch 100. Also, the persistent mapped identifier allows fabric switch 100 to "recognize" a previously configured member switch 101 when it re-joins fabric switch 100, since a dynamically assigned switch identifier can change each time switch 101 joins and is configured by fabric switch 100.

[0065] Default switch configuration table 184 has an entry for the mapped identifier that points to the corresponding ST in configuration database 180. Note that configuration database 180 is replicated and distributed to all switches in fabric switch 100. Default switch configuration table 184 is local to a particular member switch.
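The role of the persistent mapped identifier can be illustrated with a toy model. This is a sketch under stated assumptions: the class and attribute names are hypothetical, the switch is recognized by its MAC address (one plausible key, since each ST stores the MAC address), and mapped identifiers are simply handed out in join order.

```python
class FabricConfigDatabase:
    """Toy stand-in for the switch tables (STs) of a configuration database."""

    def __init__(self):
        self.mapped_by_mac = {}   # switch MAC address -> mapped identifier
        self.switch_tables = {}   # mapped identifier -> ST contents

    def join(self, mac, switch_id):
        # Assumption: a returning switch is recognized by its MAC address.
        if mac not in self.mapped_by_mac:
            mapped_id = len(self.mapped_by_mac)
            self.mapped_by_mac[mac] = mapped_id
            self.switch_tables[mapped_id] = {"mac": mac, "interfaces": {}}
        mapped_id = self.mapped_by_mac[mac]
        # The switch identifier may differ on every join; the mapped one does not.
        self.switch_tables[mapped_id]["switch_id"] = switch_id
        return mapped_id


db = FabricConfigDatabase()
first = db.join("00:05:1e:00:00:01", "10.0.0.101")
db.switch_tables[first]["interfaces"]["2/15"] = {"vlan": 110}
# The same switch re-joins later with a different, dynamically assigned identifier.
again = db.join("00:05:1e:00:00:01", "10.0.0.201")
```

Because the second join resolves to the same mapped identifier, the earlier interface configuration is found again without being re-entered.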

[0066] The "IN_FABRIC" value in default switch configuration table 184 indicates whether the member switch is part of a fabric switch. A switch is considered to be "in a fabric switch" when it is assigned one of the switch identifiers by a fabric switch. When a switch is first connected to fabric switch 100, the fabric switch formation process allocates a new switch identifier to the joining switch. In one embodiment, only the switches directly connected to the new switch participate in the join operation.

[0067] Note that in the case where the global configuration database of a joining switch is current and in sync with the global configuration database of fabric switch 100 based on a comparison of the transaction identifiers of the two databases (e.g., when a member switch is temporarily disconnected from fabric switch 100 and re-connected shortly afterward), a trivial merge is performed. That is, the joining switch can be connected to fabric switch 100, and no change or update to the global configuration database is required.

[0068] FIG. 1D illustrates exemplary logical units in an IP-based fabric switch, in accordance with an embodiment of the present invention. The logical building block of fabric switch 100 can be a single switch or a group of switches. The selection of a building block can be based on the customer requiring a switch-level high availability. To facilitate high availability among member switches of fabric switch 100, a plurality of switches can form a logical unit, and a logical unit can operate as a single logical member of fabric switch 100. In some embodiments, switches 103 and 104 can form a logical unit 162, and switches 101 and 102 can form another logical unit 164. A fabric switch can include a combination of logical units and standalone member switches. In the example in FIG. 1D, fabric switch 100 includes logical units 162 and 164, and standalone member switch 105. In some embodiments, standalone switch 105 can operate as a logical unit.

[0069] A respective switch can be assigned a logical unit identifier, which identifies to which logical unit the switch belongs. For example, switches 103 and 104 can have the same logical unit identifier, which identifies logical unit 162. During operation, a switch discovers other switches with the same fabric identifier and logical unit identifier, and automatically forms a logical unit with the discovered switches. The switches in a logical unit can operate in an active-standby mode (one switch remains active and others are in standby mode), or in an active-active mode (a respective switch receives and forwards traffic).

Tunnel-Based Fabric Encapsulation

[0070] In some embodiments, a respective member switch forwards traffic in fabric switch 100 based on an encapsulation header (e.g., a tunnel encapsulation header). FIG. 2A illustrates exemplary inter-switch tunnels in an IP-based fabric switch, in accordance with an embodiment of the present invention. Suppose that fabric switch 100 is coupled with a layer-3 core network 200 via switches 101 and 102. For forwarding traffic within fabric switch 100, not all member switches need to participate in the routing protocol of network 200 (e.g., the overlay routing protocol). Switches 101 and 102 can participate in the routing protocol of network 200 and operate as gateways. It should be noted that a respective member switch, including switches 101 and 102, participates in the internal (or underlay) routing protocol of fabric switch 100 for IP-based forwarding within fabric switch 100. The internal routing protocol interfaces with network 200. The internal routing protocol advertises the subnets associated with the overlay network as connected routes to network 200.

[0071] In some embodiments, inter-switch packet forwarding in fabric switch 100 is based on tunnel encapsulation. Examples of a tunnel encapsulation protocol include, but are not limited to, Virtual Extensible Local Area Network (VXLAN), Generic Routing Encapsulation (GRE), and its variations, such as Network Virtualization using GRE (NVGRE) and openvSwitch GRE. A respective switch of fabric switch 100 establishes a tunnel with a respective other member switch (e.g., a full mesh of tunnels). For example, switch 103 establishes tunnels 212, 214, 216, and 218 with switches 101, 102, 104, and 105, respectively. If a tunnel spans an inter-switch link aggregation, that tunnel can use all links in the link aggregation to forward traffic. For example, tunnel 212 can use all links in link aggregation 130.

[0072] It should be noted that, even though inter-switch packet forwarding in fabric switch 100 is based on a tunnel encapsulation header, fabric switch 100 represents itself as a single logical tunnel end point (TEP) 210 to external end devices. Logical tunnel end point 210 can be associated with a virtual IP address. For example, if end device 204 establishes a tunnel with fabric switch 100 via network 200, fabric switch 100 represents itself as logical tunnel end point 210. If end device 204 sends a packet via the tunnel, end device 204 encapsulates the packet in a tunnel encapsulation header with the virtual IP address as the egress address.

[0073] One or more member switches of fabric switch 100 can participate in logical tunnel end point 210. The virtual IP address can be assigned to the member switches participating in logical tunnel end point 210. If any of these member switches receives the packet, that member switch considers the packet to be destined to the local switch and decapsulates the encapsulation header. In this way, fabric switch 100 can operate as a single logical tunnel end point for the tunnels established via network 200.

[0074] The virtual IP address associated with logical tunnel end point 210 also allows fabric switch 100 to operate as a single layer-3 gateway (e.g., a gateway router). For example, end device 114 can use the virtual IP address as the gateway IP address for all its communication. As a result, when end device 114 initiates any communication, end device 114 can issue an Address Resolution Protocol (ARP) query. When the query reaches a switch participating in logical tunnel end point 210, the switch responds with an ARP reply comprising a virtual MAC address mapped to the virtual IP address. A switch participating in logical tunnel end point 210 can maintain such mapping in a local storage device.

[0075] Upon receiving the reply, end device 114 uses the virtual MAC address as the destination MAC address for its subsequent communication. In some embodiments, one of the switches participating in logical tunnel end point 210 can be elected to respond to an ARP query for the virtual IP address. All other switches forward the ARP query to the elected switch. If the elected switch fails, another switch can be elected. In some embodiments, the switch with the highest (or lowest) switch identifier value is elected for responding to ARP queries for the virtual IP address.
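The election rule can be sketched as follows, assuming that switch identifiers are IPv4 addresses and that every participating switch knows the current set of live participants. The function name, parameters, and addresses are hypothetical.

```python
import ipaddress


def elect_arp_responder(switch_ids, failed=frozenset(), highest=True):
    """Pick the participating switch that answers ARP queries for the virtual IP.

    switch_ids: IP-address switch identifiers of the participating switches.
    failed: identifiers of switches currently considered unavailable.
    """
    candidates = [ipaddress.ip_address(s) for s in switch_ids if s not in failed]
    if not candidates:
        raise ValueError("no participating switch is available")
    # Compare addresses numerically, not as strings ("10.0.0.9" < "10.0.0.10").
    chosen = max(candidates) if highest else min(candidates)
    return str(chosen)


participants = ["10.0.0.9", "10.0.0.10"]
elected = elect_arp_responder(participants)
fallback = elect_arp_responder(participants, failed={"10.0.0.10"})
```

Since each switch applies the same deterministic rule to the same membership, the switches can agree on the responder without exchanging extra election messages in this simplified model.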

[0076] If fabric switch 100 has a full mesh of tunnels for a respective member switch, fabric switch 100 does not need to maintain a separate routing protocol for forwarding via the tunnels. For example, since switch 103 has tunnel 218 with switch 105, a packet encapsulated in a tunnel header with the switch identifiers of switches 103 and 105 as source and destination addresses, respectively, can be forwarded based on routing information of fabric switch 100. Switch 104 can receive such a packet, check its local forwarding information, and forward the packet to switch 105. As a result, packet forwarding between any two member switches can be entirely based on tunnel encapsulation. An intermediate switch, such as switch 104, does not need to decapsulate the tunnel encapsulation header.

[0077] However, a full mesh of tunnels can lead to a large number of tunnels. For example, with five member switches of fabric switch 100, the number of tunnels is twenty (four from each of the five switches). Furthermore, for broadcast, unknown unicast, and multicast (BUM) traffic, a packet is replicated for a respective tunnel. In some embodiments, tunnels in fabric switch 100 follow the physical topology of fabric switch 100. For example, switch 103 does not establish tunnel 218 with switch 105 since switch 103 is not directly coupled with switch 105. As a result, a packet from switch 103 to switch 105 is forwarded via tunnel 216 to switch 104. Upon receiving the packet, switch 104 decapsulates the encapsulation header and examines the inner MAC address to determine switch 105 to be the egress switch for the packet. Switch 104 then re-encapsulates the packet in another tunnel header associated with tunnel 222 with the switch identifiers of switches 104 and 105 as the source and destination addresses, respectively.
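The growth of a full mesh can be checked with a short calculation. The counting convention is an assumption here: the sketch reports both the number of switch pairs and the number of per-switch tunnel endpoints, and the variable names are hypothetical.

```python
from itertools import combinations

switches = [101, 102, 103, 104, 105]

# One entry per unordered pair of member switches.
pairs = list(combinations(switches, 2))

# Each switch terminates a tunnel toward every other member switch.
tunnels_per_switch = len(switches) - 1
tunnel_endpoints = len(switches) * tunnels_per_switch

# In a full mesh, an ingress switch replicates a BUM packet once per local tunnel.
bum_copies_at_ingress = tunnels_per_switch
```

Both counts grow quadratically with the number of member switches, which is the motivation for letting tunnels follow the physical topology instead.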

[0078] This also allows fabric switch 100 to have a distribution tree of tunnels for the distribution of BUM traffic. For example, if switch 101 is the root switch of the tree, upon receiving a packet belonging to BUM traffic, switch 103 forwards the packet via tunnel 212 to switch 101. Switch 101 receives the packet, decapsulates the tunnel encapsulation header, and examines the inner MAC address to determine the packet to be in BUM traffic. Hence, switch 101 uses the distribution tree to forward the packet. Switch 101 re-encapsulates the packet in another tunnel header associated with the tree with the switch identifier of switch 101 as the source address, and a multicast address associated with the tree as the destination address.

[0079] If a switch is in a logical unit, all switches of that logical unit can operate as a virtual single tunnel end point in fabric switch 100. For example, logical unit 162 can have a single virtual IP address, which can be used as the tunnel end point for logical unit 162. Switches 103 and 104 can initiate or terminate forwarding of a packet encapsulated in a tunnel encapsulation using that virtual IP address. In some embodiments, the scope of such encapsulation and termination is limited within fabric switch 100 (e.g., the tunnel encapsulation header does not leave fabric switch 100). A respective switch of the logical unit can individually operate as a tunnel end point as well. For example, both switches 103 and 104 can have their respective tunnels even though they are in logical unit 162.

[0080] In some embodiments, a respective member switch periodically determines whether a tunnel is operational. For example, a switch can use Bi-directional Forwarding Detection (BFD) in the tunnels for detection of unavailability associated with a link, link aggregation, or switch. Upon detecting the unavailability, the switch generates a notification message, which indicates the type of unavailability, and sends the notification message to other member switches. Since such unavailability changes the adjacency of switches, the internal routing protocol of a respective member switch re-computes the paths in fabric switch 100.

[0081] FIG. 2B illustrates an exemplary tunnel encapsulation header for an IP-based fabric switch, in accordance with an embodiment of the present invention. A tunnel encapsulation header can include a tunnel header 260 in addition to outer IP and outer Ethernet headers. Tunnel header 260 can include a tunnel identifier 266, which identifies the corresponding tunnel. If the tunnel encapsulation in fabric switch 100 is based on VXLAN, tunnel header 260 can be an enhanced VXLAN header and tunnel identifier 266 can be a 24-bit long VXLAN Network Identifier (VNI).

[0082] In a VXLAN header, before the VNI, 32 bits are reserved for additional usage. In some embodiments, these 32 bits are used to represent a learning label 262 and a forwarding label 264, 16 bits each, in tunnel header 260. The 8 bits after the VNI can remain as reserved bits 268. These enhancements allow fabric switch 100 to support virtual link aggregation groups, as described in conjunction with FIG. 5. Furthermore, the tunnels in fabric switch 100 can be agnostic to underlying topology.
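The bit layout of the enhanced header (16-bit learning label, 16-bit forwarding label, 24-bit VNI, 8 reserved bits) can be sketched with Python's struct module. This follows the field widths stated above and assumes network byte order with the fields in that sequence; the function names are hypothetical, and the sketch does not model the flag bits of a standard VXLAN header.

```python
import struct


def pack_tunnel_header(learning_label, forwarding_label, vni):
    """Pack the 8-byte enhanced header: 16 + 16 + 24 bits, then 8 reserved bits."""
    if not 0 <= learning_label < (1 << 16):
        raise ValueError("learning label must fit in 16 bits")
    if not 0 <= forwarding_label < (1 << 16):
        raise ValueError("forwarding label must fit in 16 bits")
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Shifting the VNI left by 8 leaves the trailing reserved byte as zero.
    return struct.pack("!HHI", learning_label, forwarding_label, vni << 8)


def unpack_tunnel_header(data):
    """Return (learning_label, forwarding_label, vni) from an 8-byte header."""
    learning_label, forwarding_label, word = struct.unpack("!HHI", data)
    return learning_label, forwarding_label, word >> 8


header = pack_tunnel_header(1, 2, 3)
fields = unpack_tunnel_header(header)
```

The packed header stays at the same 8-byte size as a VXLAN header, so only the interpretation of the first 32 bits changes in this model.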

Network Extensions

[0083] In some embodiments, fabric switch 100 participates in a software-defined network (SDN). FIG. 3A illustrates an exemplary IP-based fabric switch participating in a software-defined network, in accordance with an embodiment of the present invention. In this example, fabric switch 100 can be a heterogeneous software-defined network, which can include one or more switches capable of processing rules and configurations provided by a controller (such as those defined using OpenFlow). These one or more switches can be referred to as software-definable switches. A controller 310 is logically coupled to a respective software-definable switch in fabric switch 100 via a network 300 (e.g., a layer-2 or layer-3 network). Controller 310 can be physically coupled to a subset of the switches.

[0084] One or more services supported by fabric switch 100 can be configured from controller 310. Controller 310 can view fabric switch 100 as a logical chassis and provide configuration for the logical chassis. Upon receiving such configuration from controller 310, the receiving switch can distribute the configuration information to the member switches. For example, controller 310 can provide a set of port profiles for fabric switch 100. Upon receiving the port profiles from controller 310, the receiving member switch distributes the port profiles to other member switches. A port profile includes one or more MAC addresses of end devices, and specifies a set of configurations for a port (e.g., QoS, VLAN, and security configurations). If a switch detects a MAC address of a packet in a port profile, the switch applies the configurations of that port profile to the ingress port (and/or egress port) of the packet.
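The port profile lookup can be illustrated with a small sketch. The data layout is an assumption made for illustration: the class name, field names, profile contents, and MAC addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortProfile:
    name: str
    macs: frozenset   # MAC addresses of end devices covered by the profile
    vlan: int
    qos_class: str


def profile_for(mac, profiles):
    """Return the first profile listing this MAC address, or None."""
    mac = mac.lower()
    for profile in profiles:
        if mac in profile.macs:
            return profile
    return None


profiles = [
    PortProfile("storage", frozenset({"00:11:22:33:44:55"}), vlan=110, qos_class="lossless"),
    PortProfile("default", frozenset({"66:77:88:99:aa:bb"}), vlan=1, qos_class="best-effort"),
]
# A packet arrives with this source MAC; the matching profile configures its port.
matched = profile_for("00:11:22:33:44:55", profiles)
```

A switch holding the distributed profiles could run a lookup of this kind when it sees a packet, and then apply the returned settings to the port on which the packet arrived.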

[0085] Similarly, fabric switch 100 can receive internal routing information (e.g., paths between member switches) from controller 310. This allows a member switch to establish a tunnel and forward traffic via the tunnel to another member switch without running a routing protocol within fabric switch 100. Controller 310 can also configure a global VLAN in fabric switch 100. A global VLAN is a virtualized network in fabric switch 100 and corresponds to a customer VLAN. A global VLAN identifier can be 24 bits long and its scope can be limited within fabric switch 100. Controller 310 can provide a mapping between the global VLAN and the customer VLAN.

[0086] FIG. 3B illustrates exemplary Fibre Channel (FC) gateways in an IP-based fabric switch, in accordance with an embodiment of the present invention. In this example, switches 101 and 102 can form an FC gateway 360, and are coupled to FC storage area network (SAN) 350. In some embodiments, network 350 is an FC fabric and includes FC router 352. One or more target storage devices can be coupled to FC router 352. FC fabric 350 is dedicated to providing access to data blocks from the targets.

[0087] Switches 101 and 102 can present FC router 352 as virtual switch 352 to switches 103, 104, and 105. In some embodiments, mapped identifiers in fabric switch 100 are in the same format as the domain identifier of FC router 352. Switches 101 and 102 advertise the domain identifier of FC router 352 as the mapped (or switch) identifier of virtual switch 352. In this way, switches 101 and 102 can forward FC over Ethernet (FCoE) traffic from an end device (e.g., end device 114) to an FC domain, thereby extending the domain of network 350 to the domain of fabric switch 100. As a result, a single routing protocol instance in a respective switch in fabric switch 100 can make routing decisions for tunnel-encapsulated Fibre Channel or non-Fibre Channel packets.

Fabric Formation

[0088] FIG. 4 presents a flowchart illustrating the fabric-formation process of a member switch in an IP-based fabric switch, in accordance with an embodiment of the present invention. During operation, the switch obtains a switch identifier (e.g., an IP address) for the local switch (operation 402). The switch can obtain this switch identifier from a user or an identifier allocation service (e.g., DHCP). The switch then obtains a mapped identifier for the local switch (operation 404). The switch can also generate the mapped identifier based on the switch identifier or based on an indexing service. The switch then discovers the neighbor switches based on a link discovery protocol (e.g., LLDP) (operation 406).

[0089] The switch establishes link aggregation for the links coupling the same neighbor switch (operation 408). In the example in FIG. 1A, switch 101 forms a link aggregation 130 for the links coupling switch 103. The switch establishes point-to-point communication with a respective neighbor switch over an inter-switch link using an unnumbered layer-3 interface (operation 410). The switch determines adjacency (e.g., paths) within the fabric switch using a routing protocol (e.g., a variation of OSPF) (operation 412). The switch then establishes tunnels with other member switches based on the mesh preference of the fabric switch (operation 414). For example, if the mesh preference indicates a full mesh, the switch establishes a tunnel with a respective other member switch. On the other hand, if the mesh preference indicates a partial mesh, the switch establishes a tunnel with a subset of other member switches (e.g., directly coupled member switches).
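The tunnel selection of operation 414 can be sketched as a small function. The function name and the string values of the mesh preference are hypothetical, and the partial-mesh branch assumes the directly coupled member switches are the chosen subset, which is only one of the possibilities mentioned above.

```python
def tunnel_peers(local, members, neighbors, mesh_preference):
    """Choose which member switches the local switch builds tunnels to.

    members: all member switches of the fabric switch.
    neighbors: member switches directly coupled to the local switch.
    mesh_preference: "full" or "partial" (hypothetical values).
    """
    if mesh_preference == "full":
        # One tunnel to every other member switch.
        return sorted(m for m in members if m != local)
    if mesh_preference == "partial":
        # Assumed subset: only directly coupled member switches.
        return sorted(n for n in neighbors if n in members and n != local)
    raise ValueError(f"unknown mesh preference: {mesh_preference!r}")


members = [101, 102, 103, 104, 105]
full = tunnel_peers(103, members, neighbors={101, 104}, mesh_preference="full")
partial = tunnel_peers(103, members, neighbors={101, 104}, mesh_preference="partial")
```

In the partial case, traffic toward a switch that is not a direct neighbor would have to be relayed hop by hop through intermediate tunnels.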

[0090] In some embodiments, the switch configures the switch identifier (e.g., the IP address) as the domain identifier for the local switch (operation 416). This facilitates additional compatibility for the switch to operate in both IP-based and domain-identifier-based fabric switches. The switch then checks whether the switch supports operating in a software-defined network (operation 418). If so, the switch obtains service/feature configuration from a controller of the software-defined network (operation 420), as described in conjunction with FIG. 3A. Otherwise, the switch initiates distributed service/feature configuration (operation 422). For example, the switch can initiate sharing of configuration information of a respective member switch, as described in conjunction with FIG. 1C.

Virtual Link Aggregation

[0091] FIG. 5 illustrates an exemplary virtual link aggregation group in an IP-based fabric switch, in accordance with an embodiment of the present invention. As illustrated in FIG. 5, end devices 512 and 514 are both dual-homed and coupled to switches 103 and 104. The goal is to allow a dual-homed end device to use physical links to at least two separate switches as a single, logical aggregate link, with the same media access control (MAC) address. Such a configuration would achieve true redundancy and facilitate fast protection switching. A link aggregation group coupling a device to at least two other devices can be referred to as a virtual link aggregation group.

[0092] Switches 103 and 104 are configured to operate in a special "trunked" mode for end devices 512 and 514, thereby forming a virtual link aggregation group 530. End devices 512 and 514 view switches 103 and 104 as a common virtual switch 502, with a corresponding virtual switch identifier. Dual-homed end devices 512 and 514 are considered to be logically coupled to virtual switch 502 via logical links (denoted by dotted lines). Virtual switch 502 is considered to be logically coupled to both switches 103 and 104, optionally with zero-cost links (represented by dashed lines). Other switches may view end devices 512 and 514 to be coupled with virtual switch 502.

[0093] Among the links in a link trunk, one link is selected to be a primary link. For example, the primary link for end device 512 can be the link to switch 103. Switches participating in a virtual link aggregation group and forming a virtual switch are referred to as "partner switches." Operation of virtual switches for multi-homed end devices is specified in U.S. Patent Application No. 12/725,249, Attorney Docket No. BRCD-112-0439US, entitled "Redundant Host Connection in a Routed Network," the disclosure of which is incorporated herein in its entirety.

[0094] In a typical layer-3 redundancy protocol (e.g., Virtual Router Redundancy Protocol (VRRP)), a dual-homed end device should support layer-3 communication. On the other hand, an end device coupling a virtual link aggregation group can be in layer-2. In the example in FIG. 5, end device 514 can be a layer-2 networking device (e.g., an Ethernet switch) coupling another end device 516. Since switches 103 and 104 can be in logical unit 162, virtual link aggregation group 530 can facilitate high availability within logical unit 162 for end devices 512 and 514. Furthermore, switches 103 and 104 can initiate and terminate tunnel encapsulation based on the virtual switch identifier of virtual switch 502. Upon receiving a packet from end device 512, switch 103 (or switch 104) encapsulates the packet in a tunnel encapsulation header and includes an identifier for virtual link aggregation group 530 in the tunnel encapsulation header, as described in conjunction with FIG. 2B. Furthermore, switch 103 can assign the virtual switch identifier of virtual switch 502 as the source address in the tunnel encapsulation header.

Exemplary Switch

[0095] FIG. 6 illustrates an exemplary member switch in an IP-based fabric switch, in accordance with an embodiment of the present invention. In this example, a switch 600 includes a number of communication ports 602, a packet processor 610, a logical chassis apparatus 630, and a storage device 650. In some embodiments, packet processor 610 adds an encapsulation header to a packet.

[0096] In some embodiments, logical chassis apparatus 630 maintains a membership in a fabric switch, which can represent itself as a logical chassis. Logical chassis apparatus 630 maintains a configuration database in storage device 650 that maintains the configuration state of a respective switch within the logical chassis, as described in conjunction with FIG. 1C. Logical chassis apparatus 630 maintains the state of the fabric switch and the logical chassis, which is used to join other switches. Under such a scenario, communication ports 602 can include inter-switch communication channels for communication within the fabric switch. This inter-switch communication channel can be implemented via a regular communication port and based on any open or proprietary format (e.g., IP protocol).

[0097] Logical chassis apparatus 630 facilitates formation of the fabric switch represented as the logical chassis, as described in conjunction with FIG. 4. Logical chassis apparatus 630 also allows switch 600 to participate in the name service of the fabric switch. If switch 600 is in a logical unit, logical unit apparatus 632 maintains the logical unit for switch 600, as described in conjunction with FIG. 1D. Tunnel management apparatus 622 maintains tunnels with other member switches, as described in conjunction with FIG. 2A. In some embodiments, switch 600 includes a tunnel end point apparatus 666, which operates the logical chassis as an end point for an external tunnel (i.e., the tunnel's other end point is outside of the logical chassis). If switch 600 supports a software-defined network, SDN apparatus 640 facilitates configuration of switch 600 based on rules and configurations from a controller, as described in conjunction with FIG. 3A. Link aggregation apparatus 624 supports virtual link aggregation groups in switch 600.

[0098] Note that the above-mentioned modules and apparatuses can be implemented in hardware as well as in software. In one embodiment, these modules and apparatuses can be embodied in computer-executable instructions stored in a memory which is coupled to one or more processors in switch 600. When executed, these instructions cause the processor(s) to perform the aforementioned functions.

[0099] In summary, embodiments of the present invention provide a switch and a method for facilitating an IP-based logical chassis. In one embodiment, the switch includes a logical chassis apparatus and a tunnel apparatus. The logical chassis apparatus associates a logical chassis identifier of a logical chassis with the switch and assigns an Internet Protocol (IP) address as switch identifier of the switch. The logical chassis includes a plurality of member switches and the switch is a member switch of the logical chassis. The IP address uniquely identifies the switch in the logical chassis. The tunnel apparatus establishes a tunnel with a remote switch in the logical chassis. An inter-switch packet from the switch is encapsulated in a tunnel header associated with the tunnel.
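The relationship summarized above can be illustrated with a short sketch. It assumes a simplified model in which a tunnel is recorded as the remote switch's IP address and the tunnel header carries only the source address, destination address, and logical chassis identifier. The names `MemberSwitch`, `TunnelHeader`, `establish_tunnel`, and `encapsulate` are hypothetical, and no particular encapsulation format is implied.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TunnelHeader:
    """Hypothetical tunnel header; the fields are illustrative only."""
    src_ip: ipaddress.IPv4Address
    dst_ip: ipaddress.IPv4Address
    chassis_id: int


@dataclass
class MemberSwitch:
    chassis_id: int                   # logical chassis identifier
    switch_ip: ipaddress.IPv4Address  # IP address used as switch identifier
    tunnels: set = field(default_factory=set)  # remote IPs with a tunnel

    def establish_tunnel(self, remote):
        # Tunnels in this sketch exist only between distinct members
        # of the same logical chassis.
        if remote.chassis_id != self.chassis_id:
            raise ValueError("remote switch is not in this logical chassis")
        if remote.switch_ip == self.switch_ip:
            raise ValueError("switch identifier must be unique in the chassis")
        self.tunnels.add(remote.switch_ip)

    def encapsulate(self, remote_ip, payload):
        # Wrap an inter-switch packet in the header of an existing tunnel.
        if remote_ip not in self.tunnels:
            raise KeyError("no tunnel established with remote switch")
        return TunnelHeader(self.switch_ip, remote_ip, self.chassis_id), payload


a = MemberSwitch(1, ipaddress.ip_address("10.0.0.1"))
b = MemberSwitch(1, ipaddress.ip_address("10.0.0.2"))
a.establish_tunnel(b)
header, packet = a.encapsulate(b.switch_ip, b"inter-switch frame")
```

In this sketch the IP address serves both as the switch identifier and as the tunnel end point address, which mirrors the summary; an actual embodiment may separate these roles or carry additional header fields.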

[00100] The methods and processes described herein can be embodied as code and/or data, which can be stored in a computer-readable non-transitory storage medium. When a computer system reads and executes the code and/or data stored on the computer-readable non-transitory storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the medium.

[00101] The methods and processes described herein can be executed by and/or included in hardware modules or apparatus. These modules or apparatus may include, but are not limited to, an application-specific integrated circuit (ASIC) chip, a field-programmable gate array (FPGA), a dedicated or shared processor that executes a particular software module or a piece of code at a particular time, and/or other programmable-logic devices now known or later developed. When the hardware modules or apparatus are activated, they perform the methods and processes included within them.

[00102] The foregoing descriptions of embodiments of the present invention have been presented only for purposes of illustration and description. They are not intended to be exhaustive or to limit this disclosure. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. The scope of the present invention is defined by the appended claims.