Title:
KEEPALIVE SCHEDULER IN A NETWORK DEVICE
Document Type and Number:
WIPO Patent Application WO/2017/210209
Kind Code:
A1
Abstract:
A network device may execute a process (e.g., a software keepalive process (SKAP)) that schedules the transmission of keepalive messages or packets. The network device maintains a database of keepalive network sessions storing information that is used for scheduling the transmission of the keepalive messages or packets for the keepalive network sessions. The database may be read and a next transmission time and session frequency for one or more keepalive sessions may be determined. The one or more keepalive sessions may then be placed in appropriate banks within a timer queue based on the determined next transmission time and session frequency. Each bank is associated with a time period from the current time. The keepalive sessions having sooner next transmission times are placed in higher priority banks. The scheduler may allow for real-time scheduling of the one or more keepalive sessions.

Inventors:
DUTTA RAJIB (US)
LI MICHAEL (US)
MANETI ARAVINDU (IN)
Application Number:
PCT/US2017/035032
Publication Date:
December 07, 2017
Filing Date:
May 30, 2017
Assignee:
BROCADE COMM SYSTEMS INC (US)
International Classes:
H04L12/26; H04L29/08
Foreign References:
US20150304201A1 2015-10-22
US20140112318A1 2014-04-24
US6370656B1 2002-04-09
US20120151085A1 2012-06-14
US20070234332A1 2007-10-04
Attorney, Agent or Firm:
DAVE, Urmil V. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A method comprising:

reading a database configured to store data for one or more keepalive network sessions for a network device;

determining, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions;

placing, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further comprising a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time; and

transmitting, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank.

2. The method of claim 1, wherein the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device.

3. The method of claim 2, wherein at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database.

4. The method of claim 2, wherein:

the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device; and

the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, wherein the second virtual machine does not perform the set of functions when operating in the standby mode.

5. The method of claim 1, wherein the first time period is ten milliseconds.

6. The method of claim 1, wherein the plurality of banks comprises ten banks.

7. The method of claim 1, wherein if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank.

8. A network device comprising:

a database configured to store data for one or more keepalive network sessions for the network device;

one or more processors executing a keepalive subsystem process, wherein the keepalive subsystem process is configured to:

read the database;

determine, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions;

place, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further comprising a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time; and

transmit, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank.

9. The network device of claim 8, wherein the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device.

10. The network device of claim 9, wherein at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database.

11. The network device of claim 9, wherein:

the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device; and

the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, wherein the second virtual machine does not perform the set of functions when operating in the standby mode.

12. The network device of claim 8, wherein the first time period is ten milliseconds.

13. The network device of claim 8, wherein the plurality of banks comprises ten banks.

14. The network device of claim 8, wherein if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank.

15. One or more non-transitory computer-readable media storing computer-executable instructions that, when executed by one or more computing devices, cause the one or more computing devices to:

read a database configured to store data for one or more keepalive network sessions for a network device;

determine, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions;

place, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further comprising a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time; and

transmit, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank.

16. The one or more non-transitory computer-readable media of claim 15, wherein the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device.

17. The one or more non-transitory computer-readable media of claim 16, wherein at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database.

18. The one or more non-transitory computer-readable media of claim 16, wherein:

the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device; and

the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, wherein the second virtual machine does not perform the set of functions when operating in the standby mode.

19. The one or more non-transitory computer-readable media of claim 15, wherein the first time period is ten milliseconds.

20. The one or more non-transitory computer-readable media of claim 15, wherein the plurality of banks comprises ten banks.

21. The one or more non-transitory computer-readable media of claim 15, wherein if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank.

22. A method comprising:

reading a database configured to store data for one or more keepalive network sessions for a network device;

determining, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions; and

placing, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further comprising a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time.

23. The method of claim 22, further comprising transmitting, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank.

24. The method of any one of claims 22-23, wherein the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device.

25. The method of any one of claims 22-24, wherein the first time period is ten milliseconds.
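For illustration only, the bank placement recited in the claims above may be sketched as follows. This is a minimal sketch and not the claimed implementation: the function names, the millisecond integer clock, the exact bank boundaries, and the clamping of far-future sessions into the last bank are all assumptions made for the example. The ten-millisecond period, the ten banks, and the rule for overdue sessions follow the dependent claims.

```python
BANK_PERIOD_MS = 10   # ten-millisecond time period per bank (per the dependent claims)
NUM_BANKS = 10        # ten banks (per the dependent claims)


def next_transmission_ms(previous_tx_ms, interval_ms):
    """Next transmission time from the previous time and the session interval."""
    return previous_tx_ms + interval_ms


def bank_index(next_tx_ms, now_ms):
    """Return the bank for a session; bank 0 is the highest priority bank."""
    delta = next_tx_ms - now_ms
    if delta <= 0:
        return 0  # overdue sessions go to the highest priority bank
    # Assumption: bank k covers (k * period, (k + 1) * period] from now, and
    # anything later than the last bank's window is clamped into the last bank.
    return min((delta - 1) // BANK_PERIOD_MS, NUM_BANKS - 1)


def build_timer_queue(sessions, now_ms):
    """Place each session identifier into a bank of the timer queue."""
    banks = [[] for _ in range(NUM_BANKS)]
    for session_id, (previous_tx_ms, interval_ms) in sessions.items():
        due = next_transmission_ms(previous_tx_ms, interval_ms)
        banks[bank_index(due, now_ms)].append(session_id)
    return banks


# Hypothetical database contents: session id -> (previous tx time, interval), in ms.
sessions = {
    "s1": (990, 15),      # due in 5 ms
    "s2": (900, 125),     # due in 25 ms
    "s3": (400, 500),     # already overdue
    "s4": (1000, 10000),  # due in 10 s
}
banks = build_timer_queue(sessions, now_ms=1000)
```

In this sketch a transmitter would drain `banks[0]` first, since it holds the sessions due within the first time period.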

Description:
KEEPALIVE SCHEDULER IN A NETWORK DEVICE

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit and priority of Indian Provisional Application No. 201641018590, filed May 31, 2016, entitled “KEEPALIVE TECHNIQUE IN A NETWORK DEVICE.” The entire content of the 201641018590 application is incorporated herein by reference for all purposes.

BACKGROUND

[0002] In computer networking, keepalive (KA) messages or packets (also sometimes referred to as hello messages) are commonly used for a variety of different purposes including to check connectivity and the health of network devices. For example, a particular network device may transmit keepalive messages to other network devices (e.g., to the neighbors of the particular network device) at regular time intervals. A network device receiving the keepalive messages may use the messages to determine the health of the sender of the messages and also to check connectivity to the sender of the messages (e.g., check whether a link between the particular network device and the network device receiving the messages is operational), and the like. If a network device, such as a router, stops receiving keepalive messages from a neighbor, after a set period (sometimes referred to as the dead interval), the router may assume the neighbor network device has gone down or there is something wrong with the connectivity to the neighbor network device, and take responsive actions. For example, if the recipient network device determines that a link is down due to not receiving keepalive messages from a particular network device, the recipient network device may use a different path to route data until the link is up again.
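The dead-interval check described in the preceding paragraph may be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the use of seconds, and the primary/backup path labels are hypothetical and are not taken from any particular protocol implementation.

```python
def neighbor_presumed_down(last_keepalive_s, now_s, dead_interval_s):
    """True when no keepalive has arrived within the dead interval."""
    return (now_s - last_keepalive_s) > dead_interval_s


def choose_path(primary, backup, last_keepalive_s, now_s, dead_interval_s):
    """Route via the backup path while the neighbor on the primary path is presumed down."""
    if neighbor_presumed_down(last_keepalive_s, now_s, dead_interval_s):
        return backup
    return primary


# Hypothetical values: last keepalive received at t = 100 s, dead interval of 40 s.
path_at_130 = choose_path("primary", "backup", 100.0, 130.0, 40.0)
path_at_141 = choose_path("primary", "backup", 100.0, 141.0, 40.0)
```

With these values the receiver keeps using the primary path at t = 130 s and switches to the backup path at t = 141 s, once the dead interval has elapsed without a keepalive.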

[0003] A network device may receive and transmit different types of keepalive messages corresponding to different protocols that involve sending of keepalive messages. Examples of protocols that involve transmission of keepalive messages at regular intervals include Intermediate System - Intermediate System (IS-IS), Resource Reservation Protocol (RSVP), Multiple Spanning Tree Protocol (MSTP), Link Aggregation Control Protocol (LACP), Open Shortest Path First (OSPF), Unidirectional Link Detection (UDLD), Generic Routing Encapsulation (GRE), Rapid Spanning Tree Protocol (RSTP), and others. A network device may open and maintain a session (“keepalive network session”) to facilitate the transmission of keepalive messages. Different such keepalive network sessions may be opened and maintained by a network device for different protocols. Several of the sessions may be maintained in parallel. For each session, the network device is configured to transmit keepalive messages at regular pre-defined time intervals specified by the protocol associated with the session. A keepalive message transmitted for a session may identify the associated protocol and may also comprise information identifying the session for which the message has been transmitted.

[0004] As indicated above, keepalive messages for a session have to be sent at predefined time intervals, where the duration of the time interval is typically defined by the keepalive protocol corresponding to that session. For example, for the OSPF protocol, keepalive messages have to be transmitted every ten seconds. As another example, for the IS-IS protocol, keepalive messages have to be transmitted every ten seconds. For some other protocols, keepalive messages may have to be transmitted every second.

[0005] As networks have become faster, and in order to detect network problems sooner, the time intervals for sending keepalive messages have become shorter. These periodic time intervals can be on the order of milliseconds (msecs) or even shorter. For example, for the UDLD protocol, the periodic time interval is 500 msecs. In another example, some protocols may have a periodic time interval of 100 msecs. Such reduced time intervals are becoming problematic for network devices that are not capable of handling the transmission of keepalive messages within such short time intervals.

[0006] The problem is further compounded for network devices that provide high availability (HA) by supporting non-stop routing (NSR) and/or non-stop forwarding (NSF). In such a network device, the data forwarding or routing functionality provided by the network device is expected to continue without much impact even when the network device experiences certain events (e.g., a soft reboot, software upgrade, certain component failures) that impact the functionality of the network device. Such NSR or NSF functionality is typically provided using redundant subsystems. In a typical setup, a network device provides redundant subsystems for performing data forwarding or routing functions that are configured to operate according to the active-standby model of operation. In such implementations, one of the subsystems operates in an “active” mode and performs a set of networking functions while the other subsystem operates in a “standby” mode in which the set of functions performed by the subsystem operating in the active mode are not performed. In response to certain events, a failover or switchover may occur that causes the subsystem previously operating in the standby mode prior to the failover to start operating in the active mode and take over performance of the functions performed in active mode. The previous subsystem operating in active mode may operate in the standby mode. This enables the set of networking functions performed by the network device to continue to be performed without significant interruption.

[0007] In conventional network devices, transmission of keepalive messages is handled by the subsystem operating in active mode. However, the failover or switchover itself may take a few seconds or even a few minutes. During this time period, keepalive messages may not be sent by the network device until the new active subsystem becomes fully functional (because the previous active subsystem is no longer active and the previous standby subsystem is in the process of being “brought up” in active mode). This can be problematic, for example, for keepalive protocols requiring keepalive messages to be sent in time intervals on the order of milliseconds. This may cause one or more devices in the network receiving the keepalive messages to incorrectly assume that a particular keepalive network session is no longer active or has been dropped, or that the sender network device is down or a link is no longer operating.

BRIEF SUMMARY
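As a rough worked example of why a switchover is problematic at short intervals, the number of keepalive transmissions that fall inside an outage can be estimated as below. The function name and the two-second outage are hypothetical, and the sketch assumes the outage begins immediately after a successful transmission.

```python
def missed_keepalives(outage_ms, interval_ms):
    """Count scheduled transmissions that fall inside an outage.

    Assumes the outage starts right after a successful transmission, so the
    first missed keepalive is due one interval into the outage.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return outage_ms // interval_ms


# Hypothetical two-second switchover at three different keepalive intervals.
missed_at_100_ms = missed_keepalives(2000, 100)
missed_at_500_ms = missed_keepalives(2000, 500)
missed_at_10_s = missed_keepalives(2000, 10000)
```

Under these assumptions a two-second switchover drops 20 keepalives at a 100 msec interval and 4 at a 500 msec interval, but none at a ten-second interval, which is why the problem mainly affects sessions with millisecond-scale intervals.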

[0008] The present disclosure relates generally to networking technologies, and more particularly to mechanisms for sending keepalive messages or packets. More specifically, the present disclosure relates to a network device that is configured to send uninterrupted non-stop keepalive messages or packets for multiple keepalive network sessions.

[0009] A network device may execute a process (e.g., a software keepalive process (SKAP)) that enables the network device to continue to send keepalive messages or packets without interruption even during events such as a subsystem switchover or an in-place system upgrade. The network device maintains a database of keepalive network sessions storing information that is used to schedule and send keepalive messages or packets. The database may be shared between multiple subsystems and programs executed by the network device. In certain embodiments, the database may be updated by a subsystem executed by the processor and the information may then be used by the SKAP to schedule and send out keepalive messages or packets. The shared database may be highly scalable and flexible in order to allow a variety of protocols to be supported both presently and in the future.
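The decoupling described in the preceding paragraph may be illustrated with the following sketch, in which the keepalive process depends only on the shared database and therefore keeps sending while the subsystem that populated the database is unavailable. Everything here is an assumption made for illustration: the record fields, the class name, the plain dictionary standing in for the shared database, and the simulated clock.

```python
from dataclasses import dataclass


@dataclass
class SessionRecord:
    protocol: str      # e.g. "UDLD"
    interval_ms: int   # time between keepalive transmissions
    last_tx_ms: int    # time of the most recent transmission


class KeepaliveProcess:
    """Stand-in for the SKAP: it reads only the shared database."""

    def __init__(self, database):
        self.database = database
        self.sent = []  # (time_ms, session_id) pairs, kept for inspection

    def tick(self, now_ms):
        for session_id, rec in self.database.items():
            if now_ms - rec.last_tx_ms >= rec.interval_ms:
                self.sent.append((now_ms, session_id))
                rec.last_tx_ms = now_ms


database = {}  # shared store (a plain dict in this sketch)
# Written once by the active subsystem before the switchover.
database["udld-port1"] = SessionRecord("UDLD", 500, 0)
skap = KeepaliveProcess(database)

active_subsystem_up = True
for now_ms in range(0, 2001, 100):
    if now_ms == 600:
        active_subsystem_up = False  # switchover begins; no writer is running
    if now_ms == 1800:
        active_subsystem_up = True   # new active subsystem is fully functional
    skap.tick(now_ms)                # never consults active_subsystem_up

tx_times = [t for t, _ in skap.sent]
```

Here the transmissions at 1000 and 1500 msec occur while the simulated active subsystem is down, because the sender never consults anything other than the database.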

[0010] A majority of networking protocols are session based, which means peers of protocols exchange keepalive messages or packets (“heartbeats”) to establish and allow continuation of the connectivity amongst them. Failure of such keepalive messages or packets may result in session disconnect and cleanup of the session so that there is a finite set of connectivity. Thus, the SKAP may ensure, unless a protocol decides otherwise, that the keepalive messages or packets continue to be sent at all times, e.g., during software upgrades (a.k.a. In Service Software Upgrade (ISSU)), planned Active-Standby switchover, unplanned failover, etc.

[0011] A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes a network device including: a database configured to store data for one or more keepalive network sessions for the network device. The network device also includes one or more processors executing a first virtual machine and a keepalive subsystem process; where the first virtual machine is configured to: perform a set of functions to facilitate forwarding of data packets from the network device; and store information for a first keepalive network session in the database, the information for the first keepalive network session including information identifying a keepalive protocol for the first keepalive network session and information identifying a time interval period for transmitting keepalive packets for the first keepalive network session; and where the keepalive subsystem process is configured to: access the information for the first keepalive network session from the database; and based at least in part on the information for the first keepalive network session stored in the database, schedule transmission of one or more keepalive packets for the first keepalive network session from the network device. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0012] Implementations may include one or more of the following features.
The network device where: the one or more processors execute a second virtual machine. The network device may also include the first virtual machine operates in an active mode and performs the set of functions. The network device may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The network device further including: a set of one or more line cards. The network device may also include a backplane enabling communications between the set of line cards. The network device may also include where the one or more processors and the database are located on a first line card from the set of line cards. The network device further including a packet processor configured to transmit the scheduled one or more keepalive packets from the network device via one or more ports of the network device. The network device where the keepalive subsystem process is, based at least in part on the information for the first keepalive network session stored in the database, configured to: determine a transmission time of a most recent transmitted keepalive packet for the first keepalive network session. The network device may also include determine a keepalive transmission frequency for the first keepalive network session. The network device may also include determine, based upon the transmission time of the most recent transmitted keepalive packet for the first keepalive network session and the keepalive transmission frequency for the first keepalive network session, a transmission time for transmitting a next keepalive packet from the network device for the first keepalive network session. The network device where: the one or more processors are configured to execute a host operating system. 
The network device may also include the keepalive subsystem process is executed within a user space of the host operating system. The network device where the host operating system is Linux. The network device where: the first virtual machine is configured to: store information for a second keepalive network session to the database, the information for the second keepalive network session including information identifying a keepalive protocol for the second keepalive network session and information identifying a time interval period for transmitting keepalive packets for the second keepalive network session; and where the keepalive subsystem process is configured to: access the information for the second keepalive network session from the database; and based at least in part on the information for the second keepalive network session stored in the database, schedule transmission of one or more keepalive packets for the second keepalive network session from the network device. The network device where the keepalive protocol for the first keepalive network session is same as the keepalive protocol for the second keepalive network session. The network device where the keepalive protocol for the first keepalive network session is different from the keepalive protocol for the second keepalive network session and the time interval period for transmitting keepalive packets for the first keepalive network session is different from the time interval period for transmitting keepalive packets for the second keepalive network session. The method further including executing, via the processor, a second virtual machine, where: the first virtual machine operates in an active mode and performs the set of functions. The method may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode.
The method where the network device includes: a set of one or more line cards. The method may also include a backplane enabling communications between the set of line cards. The method may also include where the one or more processors and the database are located on a first line card from the set of line cards. The method further including transmitting, via a packet processor, the scheduled one or more keepalive packets from the network device via one or more ports of the network device. The method further including: determining, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a transmission time of a most recent transmitted keepalive packet for the first keepalive network session. The method may also include determining, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a keepalive transmission frequency for the first keepalive network session. The method may also include determining, via the keepalive subsystem process and based upon the transmission time of the most recent transmitted keepalive packet for the first keepalive network session and the keepalive transmission frequency for the first keepalive network session, a transmission time for transmitting a next keepalive packet from the network device for the first keepalive network session. 
The method where the first virtual machine is configured to store information for a second keepalive network session to the database, the information for the second keepalive network session including information identifying a keepalive protocol for the second keepalive network session and information identifying a time interval period for transmitting keepalive packets for the second keepalive network session, and the method further includes: accessing, via the keepalive subsystem process, the information for the second keepalive network session from the database. The method may also include based at least in part on the information for the second keepalive network session stored in the database, scheduling, via the keepalive subsystem process, transmission of one or more keepalive packets for the second keepalive network session from the network device. The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to execute, via the processor, a second virtual machine, where: the first virtual machine operates in an active mode and performs the set of functions. The one or more non-transitory computer-readable media may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The one or more non-transitory computer-readable media where the network device includes: a set of one or more line cards. The one or more non-transitory computer-readable media may also include a backplane enabling communications between the set of line cards. The one or more non-transitory computer-readable media may also include where the one or more processors and the database are located on a first line card from the set of line cards.
The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to transmit, via a packet processor, the scheduled one or more keepalive packets from the network device via one or more ports of the network device. The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to: determine, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a transmission time of a most recent transmitted keepalive packet for the first keepalive network session. The one or more non-transitory computer-readable media may also include determine, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a keepalive transmission frequency for the first keepalive network session. The one or more non-transitory computer-readable media may also include determine, via the keepalive subsystem process and based upon the transmission time of the most recent transmitted keepalive packet for the first keepalive network session and the keepalive transmission frequency for the first keepalive network session, a transmission time for transmitting a next keepalive packet from the network device for the first keepalive network session.
The one or more non-transitory computer-readable media where the first virtual machine is configured to store information for a second keepalive network session to the database, the information for the second keepalive network session including information identifying a keepalive protocol for the second keepalive network session and information identifying a time interval period for transmitting keepalive packets for the second keepalive network session, and where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to: access, via the keepalive subsystem process, the information for the second keepalive network session from the database. The one or more non-transitory computer-readable media may also include based at least in part on the information for the second keepalive network session stored in the database, schedule, via the keepalive subsystem process, transmission of one or more keepalive packets for the second keepalive network session from the network device. The method where the first virtual machine and the keepalive subsystem process are executed by one or more processors. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.
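By way of a hedged illustration, the determination of the next transmission time from the most recent transmission time and the per-session interval, applied to two sessions with different intervals, may be sketched as follows. The sketch orders transmissions with a binary heap from the Python standard library, which is a substitute chosen for brevity and is not a structure described in this disclosure; the session identifiers and interval values are hypothetical.

```python
import heapq


def schedule_order(records, count):
    """Return the first `count` (time_ms, session_id) transmissions in time order.

    `records` maps session id -> (most recent tx time in ms, interval in ms).
    """
    heap = [
        (last_tx_ms + interval_ms, session_id, interval_ms)
        for session_id, (last_tx_ms, interval_ms) in records.items()
    ]
    heapq.heapify(heap)
    order = []
    while heap and len(order) < count:
        due_ms, session_id, interval_ms = heapq.heappop(heap)
        order.append((due_ms, session_id))
        # The transmission just scheduled becomes the most recent one.
        heapq.heappush(heap, (due_ms + interval_ms, session_id, interval_ms))
    return order


# Two hypothetical sessions with different intervals.
records = {"a": (0, 300), "b": (100, 500)}
order = schedule_order(records, 4)
```

For these records the first four transmissions are session "a" at 300 msec, sessions "a" and "b" at 600 msec, and session "a" again at 900 msec.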

[0013] One general aspect includes a method including: executing, via a processor on a network device, a first virtual machine and a keepalive subsystem process, where the first virtual machine is configured to: perform a set of functions to facilitate forwarding of data packets from the network device; and store information for a first keepalive network session in the database, the information for the first keepalive network session including information identifying a keepalive protocol for the first keepalive network session and information identifying a time interval period for transmitting keepalive packets for the first keepalive network session. The method also includes accessing, via the keepalive subsystem process, the information for the first keepalive network session from the database. The method also includes based at least in part on the information for the first keepalive network session stored in the database, scheduling, via the keepalive subsystem process, transmission of one or more keepalive packets for the first keepalive network session from the network device. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0014] Implementations may include one or more of the following features. The method further including executing, via the processor, a second virtual machine, where: the first virtual machine operates in an active mode and performs the set of functions. The method may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The method where the network device includes: a set of one or more line cards. The method may also include a backplane enabling communications between the set of line cards. The method may also include where the one or more processors and the database are located on a first line card from the set of line cards. The method further including transmitting, via a packet processor, the scheduled one or more keepalive packets from the network device via one or more ports of the network device. The method further including: determining, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a transmission time of a most recent transmitted keepalive packet for the first keepalive network session. The method may also include determining, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a keepalive transmission frequency for the first keepalive network session. The method may also include determining, via the keepalive subsystem process and based upon the transmission time of the most recent transmitted keepalive packet for the first keepalive network session and the keepalive transmission frequency for the first keepalive network session, a transmission time for transmitting a next keepalive packet from the network device for the first keepalive network session. 
The method where the first virtual machine is configured to store information for a second keepalive network session to the database, the information for the second keepalive network session including information identifying a keepalive protocol for the second keepalive network session and information identifying a time interval period for transmitting keepalive packets for the second keepalive network session, and the method further includes: accessing, via the keepalive subsystem process, the information for the second keepalive network session from the database. The method may also include based at least in part on the information for the second keepalive network session stored in the database, scheduling, via the keepalive subsystem process, transmission of one or more keepalive packets for the second keepalive network session from the network device. The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to execute, via the processor, a second virtual machine, where: the first virtual machine operates in an active mode and performs the set of functions. The one or more non - transitory computer - readable media may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The one or more non-transitory computer-readable media where the network device includes: a set of one or more line cards. The one or more non - transitory computer - readable media may also include a backplane enabling communications between the set of line cards. The one or more non - transitory computer - readable media may also include where the one or more processors and the database are located on a first line card from the set of line cards. 
The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to transmit, via a packet processor, the scheduled one or more keepalive packets from the network device via one or more ports of the network device. The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to: determine, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a transmission time of a most recent transmitted keepalive packet for the first keepalive network session. The one or more non - transitory computer - readable media may also include determine, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a keepalive transmission frequency for the first keepalive network session. The one or more non - transitory computer - readable media may also include determine, via the keepalive subsystem process and based upon the transmission time of the most recent transmitted keepalive packet for the first keepalive network session and the keepalive transmission frequency for the first keepalive network session, a transmission time for transmitting a next keepalive packet from the network device for the first keepalive network session. 
The one or more non-transitory computer-readable media where the first virtual machine is configured to store information for a second keepalive network session to the database, the information for the second keepalive network session including information identifying a keepalive protocol for the second keepalive network session and information identifying a time interval period for transmitting keepalive packets for the second keepalive network session, and where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to: access, via the keepalive subsystem process, the information for the second keepalive network session from the database; and schedule, via the keepalive subsystem process and based at least in part on the information for the second keepalive network session stored in the database, transmission of one or more keepalive packets for the second keepalive network session from the network device. The method where the first virtual machine and the keepalive subsystem process are executed by one or more processors. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

[0015] One general aspect includes one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by one or more computing devices, cause the one or more computing devices to: execute, via a processor on a network device, a first virtual machine and a keepalive subsystem process, where the first virtual machine is configured to perform a set of functions to facilitate forwarding of data packets from the network device and to store information for a first keepalive network session in a database, the information for the first keepalive network session including information identifying a keepalive protocol for the first keepalive network session and information identifying a time interval period for transmitting keepalive packets for the first keepalive network session; access, via the keepalive subsystem process, the information for the first keepalive network session from the database; and schedule, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, transmission of one or more keepalive packets for the first keepalive network session from the network device. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0016] Implementations may include one or more of the following features. The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to execute, via the processor, a second virtual machine, where: the first virtual machine operates in an active mode and performs the set of functions. The one or more non - transitory computer - readable media may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The one or more non- transitory computer-readable media where the network device includes: a set of one or more line cards. The one or more non - transitory computer - readable media may also include a backplane enabling communications between the set of line cards. The one or more non - transitory computer - readable media may also include where the one or more processors and the database are located on a first line card from the set of line cards. The one or more non- transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to transmit, via a packet processor, the scheduled one or more keepalive packets from the network device via one or more ports of the network device. The one or more non-transitory computer-readable media where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to: determine, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a transmission time of a most recent transmitted keepalive packet for the first keepalive network session. 
The one or more non - transitory computer - readable media may also include determine, via the keepalive subsystem process and based at least in part on the information for the first keepalive network session stored in the database, a keepalive transmission frequency for the first keepalive network session. The one or more non - transitory computer - readable media may also include determine, via the keepalive subsystem process and based upon the transmission time of the most recent transmitted keepalive packet for the first keepalive network session and the keepalive transmission frequency for the first keepalive network session, a transmission time for transmitting a next keepalive packet from the network device for the first keepalive network session. The one or more non-transitory computer-readable media where the first virtual machine is configured to store information for a second keepalive network session to the database, the information for the second keepalive network session including information identifying a keepalive protocol for the second keepalive network session and information identifying a time interval period for transmitting keepalive packets for the second keepalive network session, and where the instructions, when executed by the one or more computing devices, cause the one or more computing devices to: access, via the keepalive subsystem process, the information for the second keepalive network session from the database. The one or more non - transitory computer - readable media may also include based at least in part on the information for the second keepalive network session stored in the database, schedule, via the keepalive subsystem process, transmission of one or more keepalive packets for the second keepalive network session from the network device. The method where the first virtual machine and the keepalive subsystem process are executed by one or more processors. 
Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

[0017] One general aspect includes a method including: performing, via a first virtual machine, a set of functions to facilitate forwarding of data packets from the network device. The method also includes storing, via the first virtual machine, information for a first keepalive network session in the database, the information for the first keepalive network session including information identifying a keepalive protocol for the first keepalive network session and information identifying a time interval period for transmitting keepalive packets for the first keepalive network session. The method also includes accessing, via a keepalive subsystem process, the information for the first keepalive network session from the database. The method also includes based at least in part on the information for the first keepalive network session stored in the database, scheduling, via the keepalive subsystem process, transmission of one or more keepalive packets for the first keepalive network session from the network device. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0018] Implementations may include one or more of the following features. The method where the first virtual machine and the keepalive subsystem process are executed by one or more processors. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

[0019] One general aspect includes the method of any of the above embodiments, further including executing, via the processor, a second virtual machine, where: the first virtual machine operates in an active mode and performs the set of functions. The method also includes the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0020] One general aspect includes the method of any of the above embodiments, further including transmitting, via a packet processor, the scheduled one or more keepalive packets from the network device via one or more ports of the network device. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0021] A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes a method including: reading a database configured to store data for one or more keepalive network sessions for a network device. The method also includes determining, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions. The method also includes placing, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further including a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time. The method also includes transmitting, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank. 
Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
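The scheduling steps of this aspect can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the session "frequency" is treated as an interval in milliseconds, the ten-millisecond bank period and ten banks are taken from the optional features listed in this summary, boundary handling between banks is a guess, and sessions due later than the last bank's window are simply held in the last bank. All function and variable names are hypothetical.

```python
from collections import deque
from typing import Deque, List

BANK_PERIOD_MS = 10  # time period covered by each bank (from the listed features)
NUM_BANKS = 10       # number of banks in the timer queue (from the listed features)


def next_transmission_time(previous_tx_ms: int, interval_ms: int) -> int:
    # Assumption: the session frequency is stored as an interval in milliseconds.
    return previous_tx_ms + interval_ms


def bank_for(next_tx_ms: int, now_ms: int) -> int:
    """Return the bank index; bank 0 is the highest-priority bank."""
    if next_tx_ms <= now_ms:
        return 0  # overdue sessions go to the highest-priority bank
    index = (next_tx_ms - now_ms) // BANK_PERIOD_MS
    # Assumption: sessions due beyond the last bank's window wait in the last bank.
    return min(index, NUM_BANKS - 1)


def place(timer_queue: List[Deque[int]], session_id: int,
          previous_tx_ms: int, interval_ms: int, now_ms: int) -> int:
    """Append the session identifier to the bank matching its next transmission time."""
    bank = bank_for(next_transmission_time(previous_tx_ms, interval_ms), now_ms)
    timer_queue[bank].append(session_id)
    return bank


timer_queue: List[Deque[int]] = [deque() for _ in range(NUM_BANKS)]
now = 1000
banks = [
    place(timer_queue, 7, previous_tx_ms=995, interval_ms=10, now_ms=now),     # due in 5 ms
    place(timer_queue, 8, previous_tx_ms=990, interval_ms=25, now_ms=now),     # due in 15 ms
    place(timer_queue, 9, previous_tx_ms=900, interval_ms=50, now_ms=now),     # already overdue
    place(timer_queue, 10, previous_tx_ms=1000, interval_ms=1000, now_ms=now), # due in 1 s
]
```

A transmit loop would then drain bank 0 first, handing each session identifier to the packet processor before moving on to the lower-priority banks.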

[0022] Implementations may include one or more of the following features. The method where the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device. The method where at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database. The method where: the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device. The method may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The method where the first time period is ten milliseconds. The method where the plurality of banks includes ten banks. The method where if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank. The network device where the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device. The network device where at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database. The network device where: the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device. The network device may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The network device where the first time period is ten milliseconds. 
The network device where the plurality of banks includes ten banks. The network device where if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank. The one or more non-transitory computer- readable media where the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device. The one or more non- transitory computer-readable media where at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database. The one or more non-transitory computer-readable media where: the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device. The one or more non - transitory computer - readable media may also include the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The one or more non-transitory computer-readable media where the first time period is ten milliseconds. The one or more non-transitory computer-readable media where the plurality of banks includes ten banks. The one or more non-transitory computer-readable media where if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank. The method further including transmitting, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank. 
Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

[0023] One general aspect includes a network device including: a database configured to store data for one or more keepalive network sessions for the network device; and one or more processors executing a keepalive subsystem process, where the keepalive subsystem process is configured to: read the database; determine, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions; place, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further including a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time; and transmit, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0024] Implementations may include one or more of the following features. The network device where the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device. The network device where at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database. The network device where: the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device. The network device where the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The network device where the first time period is ten milliseconds. The network device where the plurality of banks includes ten banks. The network device where if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank. The one or more non-transitory computer-readable media where the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device. The one or more non-transitory computer-readable media where at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database. The one or more non-transitory computer-readable media where: the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device.
The one or more non-transitory computer-readable media where the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The one or more non-transitory computer-readable media where the first time period is ten milliseconds. The one or more non-transitory computer-readable media where the plurality of banks includes ten banks. The one or more non-transitory computer-readable media where if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank. The method further including transmitting, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

[0025] One general aspect includes one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by one or more computing devices, cause the one or more computing devices to: read a database configured to store data for one or more keepalive network sessions for a network device; determine, based on reading the database, a next transmission time for one of the one or more keepalive network sessions based on a previous transmission time for the one of the one or more keepalive network sessions and a keepalive network session frequency for the one of the one or more keepalive network sessions; place, based at least in part on the determined next transmission time for the one of the one or more keepalive network sessions, a session identifier for the one of the one or more keepalive network sessions in a first bank of a plurality of banks within a timer queue, the first bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted within a first time period from the current time, the plurality of banks further including a second bank for storing entries for keepalive network sessions for which keepalive packets have to be transmitted after the first time period and within a second time period from the current time; and transmit, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

[0026] Implementations may include one or more of the following features. The one or more non-transitory computer-readable media where the database is accessible by a first virtual machine and a second virtual machine being executed by a line card on the network device. The one or more non-transitory computer-readable media where at least one of the first virtual machine or second virtual machine stores data for the one or more keepalive network sessions in the database. The one or more non-transitory computer-readable media where: the first virtual machine operates in an active mode and performs a set of functions to facilitate forwarding of data packets from the network device. The one or more non-transitory computer-readable media where the second virtual machine operates in a standby mode while the first virtual machine operates in the active mode, where the second virtual machine does not perform the set of functions when operating in the standby mode. The one or more non-transitory computer-readable media where the first time period is ten milliseconds. The one or more non-transitory computer-readable media where the plurality of banks includes ten banks. The one or more non-transitory computer-readable media where if the next transmission time is before the current time, the session identifier for the one of the one or more keepalive network sessions is placed in a highest priority bank. The method further including transmitting, via a packet processor of the network device, a keepalive packet within the first period of time from the current time for the one of the one or more keepalive network sessions placed in the first bank. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

[0027] Various embodiments are claimed directed to a system, a method, and a non-transitory computer-readable medium storing a plurality of instructions executable by one or more processors, wherein any feature mentioned in one claim category, e.g., method, can be claimed in another claim category, e.g., a system, as well. The dependencies or references back in the attached claims are chosen for formal reasons only. However, any subject matter resulting from a deliberate reference back to any previous claims (in particular multiple dependencies) can be claimed as well, so that any combination of claims and the features thereof is disclosed and can be claimed regardless of the dependencies chosen in the claims. The subject-matter which can be claimed comprises not only the combinations of features as set out in the claims but also any other combination of features in the claims, wherein each feature mentioned in the claims can be combined with any other feature or combination of other features in the claims. Furthermore, any of the embodiments and features described or depicted herein can be claimed in a separate claim and/or in any combination with any embodiment or feature described or depicted herein or with any of the features of the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0028] FIG. 1 is a simplified block diagram of a network device (also referred to as a “host system”) that may incorporate teachings disclosed herein according to certain embodiments.

[0029] FIG. 2 is a simplified block diagram of yet another network device according to certain embodiments.

[0030] FIG. 3 is a simplified block diagram of a network device including a SKAP agent (process/subsystem) according to certain embodiments.

[0031] FIG. 4 illustrates interactions for keepalive setup between the various components of the network device according to some embodiments.

[0032] FIG. 5A illustrates a timer queue having a plurality of buckets holding session IDs associated with keepalive network sessions.

[0033] FIG. 5B illustrates a timer queue having a plurality of buckets holding session IDs associated with keepalive network sessions.

[0034] FIG. 6 is a flowchart illustrating the process of the SKAP scheduler according to some embodiments.

DETAILED DESCRIPTION

[0035] In the following description, for the purposes of explanation, specific details are set forth in order to provide a thorough understanding of certain inventive embodiments. However, it will be apparent that various embodiments may be practiced without these specific details. The figures and description are not intended to be restrictive. The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any embodiment or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or designs.

[0036] The present disclosure relates generally to networking technologies, and more particularly to techniques for sending keepalive messages or packets. More specifically, the present disclosure relates to a network device that is configured to send non-stop keepalive messages or packets for multiple keepalive network sessions. The network device may execute a process (e.g., a software keepalive process (SKAP)) that enables the network device to continue to send keepalive messages or packets without interruption even during events such as a virtual machine switchover or in-place system upgrade. The network device may maintain a shared database of keepalive network sessions storing information that is used to schedule and send keepalive messages or packets. The shared database may be shared between multiple subsystems and programs executed by the network device. In certain embodiments, the database may be updated by a virtual machine executed by the network device and the information may then be used by the SKAP process to schedule and send out keepalive messages or packets. The shared database may be highly scalable and flexible in order to allow a variety of protocols to be supported both presently and in the future.

[0037] A keepalive network session is a network session during which the network device transmits keepalive messages (also sometimes referred to as hello messages) from the network device to its neighboring network devices at regular intervals according to some protocol. Examples of protocols that involve sending of keepalive messages include Intermediate System - Intermediate System (IS-IS), Resource Reservation Protocol (RSVP), Multiple Spanning Tree Protocol (MSTP), Link Aggregation Control Protocol (LACP), Open Shortest Path First (OSPF), Unidirectional Link Detection (UDLD), Generic Routing Encapsulation (GRE), Rapid Spanning Tree Protocol (RSTP), and others. Keepalive messages may also be referred to as keepalive packets.

[0038] During normal operation, on the network device, a first network operating subsystem may be operating in an “active” mode and a second network operating subsystem may be operating in a “standby” mode. Examples of network operating subsystems may be virtual machines. For example, a first virtual machine executed by the network device may be executing in active mode and a second virtual machine may be executing in standby mode. The virtual machine operating in active mode may perform a set of networking functions that are not performed by the second virtual machine when operating in standby mode. For example, as part of its networking functions, the active virtual machine may open and maintain one or more keepalive network sessions. In response to certain events, a failover or switchover may occur that causes the subsystem previously operating in the standby mode prior to the failover to start operating in the active mode and take over performance of the functions performed in active mode. The switchover may cause the first subsystem to start operating in the standby mode. In certain embodiments, the SKAP may be configured to schedule and cause transmission of one or more keepalive packets for an active keepalive network session during the switchover. In this manner, even during a switchover, the SKAP enables transmission of keepalive packets to be continued uninterrupted.

[0039] FIG.1 is a simplified block diagram of a network device 100 (also referred to as a “host system”) that may incorporate teachings disclosed herein according to certain embodiments. Network device 100 may be any device that is capable of receiving and forwarding packets, which may be data packets or signaling or protocol-related packets (e.g., keep-alive packets). For example, network device 100 may receive one or more data packets and forward the data packets to facilitate delivery of the data packets to their intended destinations. In certain embodiments, network device 100 may be a router or switch such as various routers and switches provided by Brocade Communications Systems, Inc. of San Jose, California.

[0040] As depicted in FIG.1, the example network device 100 comprises multiple components including one or more processors 102, a system memory 104, a packet processor or traffic manager 106, and optionally other hardware resources or devices 108. Network device 100 depicted in FIG.1 is merely an example and is not intended to unduly limit the scope of inventive embodiments recited in the claims. One of ordinary skill in the art would recognize many possible variations, alternatives, and modifications. For example, in some implementations, network device 100 may have more or fewer components than those shown in FIG.1, may combine two or more components, or may have a different configuration or arrangement of components. Network device 100 depicted in FIG.1 may also include (not shown) one or more communication channels (e.g., an interconnect or a bus) for enabling multiple components of network device 100 to communicate with each other.

[0041] Network device 100 may include one or more processors 102. Processors 102 may include single or multicore processors. System memory 104 may provide memory resources for processors 102. System memory 104 is typically a form of random access memory (RAM) (e.g., dynamic random access memory (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM)). Information related to an operating system and programs or processes executed by processors 102 may be stored in system memory 104. Processors 102 may include general purpose microprocessors such as ones provided by Intel®, AMD®, ARM®, Freescale Semiconductor, Inc., and the like, that operate under the control of software stored in associated memory.

[0042] As shown in the example depicted in FIG.1, a host operating system 110 may be loaded in system memory 104 and executed by one or more processors 102. Host operating system 110 may be loaded, for example, when network device 100 is powered on. In certain implementations, host operating system 110 may also function as a hypervisor and facilitate management of subsystems (e.g., virtual machines) and other programs that are executed by network device 100. Managing virtual machines may include partitioning resources of network device 100, including processor and memory resources, between the various programs. A hypervisor is a program that enables the creation and management of virtual machine environments including the partitioning and management of processor, memory, and other hardware resources of network device 100 between the virtual machine environments. A hypervisor enables multiple guest operating systems (GOSs) to run concurrently on network device 100.

[0043] As an example, in certain embodiments, host operating system 110 may include a version of a KVM, which is an open source virtualization infrastructure that supports various operating systems including Linux, Windows®, and others. Other examples of hypervisors include solutions provided by VMWare®, Xen®, and others. Linux KVM is a virtual memory system, meaning that addresses seen by programs loaded and executed in system memory are virtual memory addresses that have to be mapped or translated to physical memory addresses of the physical memory. This layer of indirection enables a program running on network device 100 to have an allocated virtual memory space that is larger than the system’s physical memory.

[0044] In the example depicted in FIG.1, the memory space allocated to operating system 110 (operating as a hypervisor) is divided into a kernel space 112 and a user space 114 (also referred to as host user space). Multiple virtual machines and host processes may be loaded into host user space 114 and executed by processors 102. The memory allocated to a virtual machine (also sometimes referred to as a guest operating system or GOS) may in turn include a kernel space portion and a user space portion. A virtual machine may have its own operating system loaded into the kernel space of the virtual machine. A virtual machine may operate independently of other virtual machines executed by network device 100 and may be unaware of the presence of the other virtual machines.

[0045] A virtual machine’s operating system may be the same as or different from the host operating system 110. When multiple virtual machines are being executed, the operating system for one virtual machine may be the same as or different from the operating system for another virtual machine. In this manner, hypervisor 110 enables multiple guest operating systems to share the hardware resources (e.g., processor and memory resources) of network device 100.

[0046] For example, in the embodiment depicted in FIG.1, two virtual machines VM-1 116 and VM-2 118 have been loaded into host/guest user space 114 and are being executed by processors 102. VM-1 116 has a kernel space 126 and a user space 124. VM-2 118 has its own kernel space 130 and user space 128. Typically, each virtual machine has its own secure and private memory area that is accessible only to that virtual machine. In certain implementations, the creation and management of virtual machines 116 and 118 may be managed by hypervisor 110, which may be, for example, KVM. While only two virtual machines are shown in FIG.1, this is not intended to be limiting. In alternative embodiments, any number of virtual machines may be loaded and executed.

[0047] Various other host programs or processes may also be loaded into guest user space 114 and be executed by processors 102. For example, as shown in the embodiment depicted in FIG.1, two host processes 120 and 122 have been loaded into guest user space 114 and are being executed by processors 102. While only two host processes are shown in FIG.1, this is not intended to be limiting. In alternative embodiments, any number of host processes may be loaded and executed.

[0048] In certain embodiments, a virtual machine may run a network operating system (NOS) (also sometimes referred to as a network protocol stack) and be configured to perform processing related to forwarding of packets from network device 100. As part of this processing, the virtual machine may be configured to maintain and manage routing information that is used to determine how a data packet received by network device 100 is forwarded from network device 100. In certain implementations, the routing information may be stored in a routing database (not shown) stored by network device 100. The virtual machine may then use the routing information to program a traffic manager 106, which then performs packet forwarding using the programmed information, as described below.

[0049] The virtual machine running the NOS may also be configured to perform processing related to managing sessions for various networking protocols being executed by network device 100. These sessions may then be used to send signaling packets (e.g., keep-alive packets) from network device 100. Sending keep-alive packets enables session availability information to be exchanged between two ends of a forwarding or routing protocol.

[0050] In certain implementations, redundant virtual machines running network operating systems may be provided to ensure high availability of the network device. In such implementations, one of the virtual machines may be configured to operate in an “active” mode (this virtual machine is referred to as the active virtual machine) and perform a set of functions while the other virtual machine is configured to operate in a “standby” mode (this virtual machine is referred to as the standby virtual machine) in which the set of functions performed by the active virtual machine are not performed. The standby virtual machine remains ready to take over the functions performed by the active virtual machine. Conceptually, the virtual machine operating in active mode is configured to perform a set of functions that are not performed by the virtual machine operating in standby mode. For example, the virtual machine operating in active mode may be configured to perform certain functions related to routing and forwarding of packets from network device 100, which are not performed by the virtual machine operating in standby mode. The active virtual machine also takes ownership of and manages the hardware resources of network device 100.

[0051] Certain events may cause the active virtual machine to stop operating in active mode and the standby virtual machine to start operating in the active mode (i.e., become the active virtual machine) and take over performance of the set of functions related to network device 100 that are performed in active mode. In one example, the process of a standby virtual machine becoming the active virtual machine is referred to as a failover or switchover. As a result of the failover, the virtual machine that was previously operating in active mode prior to the failover may operate in the standby mode after the failover. A failover enables the set of functions performed in active mode to continue to be performed without interruption. Redundant virtual machines used in this manner may reduce or even eliminate the downtime of network device 100’s functionality, which may translate to higher availability of network device 100. The set of functions that are performed in active mode, and which are performed by the active virtual machine and not performed by the standby virtual machine, may differ from one network device to another.
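The role exchange described in paragraph [0051] can be illustrated with a minimal Python sketch. The names `Mode`, `VirtualMachine`, and `failover` are hypothetical and are not taken from the disclosure, and the sketch assumes the variant in which the previously active virtual machine does move to standby mode after the failover.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class VirtualMachine:
    name: str
    mode: Mode


def failover(vm_a, vm_b):
    """Swap roles so the standby VM becomes active; return the new active VM."""
    if {vm_a.mode, vm_b.mode} != {Mode.ACTIVE, Mode.STANDBY}:
        raise ValueError("expected exactly one active and one standby virtual machine")
    # Assumption: the old active VM drops to standby rather than going offline.
    vm_a.mode, vm_b.mode = vm_b.mode, vm_a.mode
    return vm_a if vm_a.mode is Mode.ACTIVE else vm_b


vm1 = VirtualMachine("VM-1", Mode.ACTIVE)
vm2 = VirtualMachine("VM-2", Mode.STANDBY)
new_active = failover(vm1, vm2)
```

After the call, `new_active` refers to VM-2 and VM-1 is in standby mode. The sketch models only the mode change; transfer of network state and of hardware ownership is outside its scope.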

[0052] Various different events may cause a failover to occur. Failovers may be voluntary or involuntary. A voluntary failover may be purposely caused by an administrator of the network device or network. For example, a network administrator may, using a command line instruction, purposely cause a failover to occur. There are various situations when this may be performed. As one example, a voluntary failover may be performed when software for the active virtual machine is to be brought offline so that it can be upgraded. As another example, a network administrator may cause a failover to occur upon noticing performance degradation on the active virtual machine or upon noticing that software executed by the active computing domain is malfunctioning.

[0053] An involuntary failover typically occurs due to some critical failure in the active virtual machine. This may occur, for example, when some condition causes the active virtual machine to be rebooted or reset. This may happen, for example, due to a problem in the virtual machine kernel, critical failure of software executed by the active virtual machine, and the like. An involuntary failover causes the standby virtual machine to automatically become the active virtual machine.

[0054] While many examples herein describe the virtual machine failover or switchover process, the embodiments described herein can apply to any instance where a virtual machine goes down.

[0055] In the example depicted in FIG.1, VM-1 116 is shown as operating in active mode and VM-2 118 is shown as operating in standby mode. The active-standby model enhances the availability of network device 100 by enabling the network device to support various high-availability functionality such as graceful restart, non-stop routing (NSR), and the like.

[0056] During normal operation of network device 100, there may be some messaging that takes place between the active virtual machine and the standby virtual machine. For example, the active virtual machine may use messaging to pass network state information to the standby virtual machine. The network state information may comprise information that enables the standby virtual machine to become the active virtual machine upon a failover or switchover in a non-disruptive manner. Various different schemes may be used for the messaging, including but not restricted to Ethernet-based messaging, Peripheral Component Interconnect (PCI)-based messaging, shared memory based messaging, and the like.

[0057] Hardware resources or devices 108 may include without restriction one or more field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), I/O devices, and the like. I/O devices may include devices such as Ethernet devices, PCI Express (PCIe) devices, and others. In certain implementations, some of hardware resources 108 may be partitioned between multiple virtual machines executed by network device 100 or, in some instances, may be shared by the virtual machines. One or more of hardware resources 108 may assist the active virtual machine in performing networking functions. For example, in certain implementations, one or more FPGAs may assist the active virtual machine in performing the set of functions performed in active mode.

[0058] As previously indicated, network device 100 may be configured to receive and forward packets to facilitate delivery of the packets to their intended destinations. The packets may include data packets and signal or protocol packets (e.g., keep-alive packets). The packets may be received and/or forwarded using one or more ports 107. Ports 107 represent the I/O plane for network device 100. A port within ports 107 may be classified as an input port or an output port depending upon whether network device 100 receives or transmits a packet using that port. A port over which a packet is received by network device 100 may be referred to as an input port. A port used for communicating or forwarding a packet from network device 100 may be referred to as an output port. A particular port may function both as an input port and an output port. A port may be connected by a link or interface to a neighboring network device or network. In some implementations, multiple ports of network device 100 may be logically grouped into one or more trunks.

[0059] Ports 107 may be capable of receiving and/or transmitting different types of network traffic at different speeds, such as speeds of 1 Gigabit per second (Gbps), 10 Gbps, 100 Gbps, or more. Various different configurations of ports 107 may be provided in different implementations of network device 100. For example, configurations may include 72 10 Gbps ports, 60 40 Gbps ports, 36 100 Gbps ports, and various other combinations.

[0060] In certain implementations, upon receiving a data packet via an input port, network device 100 is configured to determine an output port to be used for transmitting the data packet from network device 100 to facilitate communication of the packet to its intended destination. Within network device 100, the packet is forwarded from the input port to the determined output port and then transmitted or forwarded from network device 100 using the output port.

[0061] Various different components of network device 100 are configured to cooperatively perform processing for determining how a packet is to be forwarded from network device 100. In certain embodiments, packet processor or traffic manager 106 may be configured to perform processing to determine how a packet is to be forwarded from network device 100. In certain embodiments, packet processor or traffic manager 106 may be configured to perform packet classification, modification, forwarding and Quality of Service (QoS) functions. As previously indicated, traffic manager 106 may be programmed to perform forwarding of data packets based upon routing information maintained by the active virtual machine. In certain embodiments, upon receiving a packet, traffic manager 106 is configured to determine, based upon information extracted from the received packet (e.g., information extracted from a header of the received packet), an output port of network device 100 to be used for forwarding the packet from network device 100 such that delivery of the packet to its intended destination is facilitated. Traffic manager 106 may then cause the packet to be forwarded within network device 100 from the input port to the determined output port. The packet may then be forwarded from network device 100 to the packet’s next hop using the output port.

[0062] In certain instances, traffic manager 106 may be unable to determine how to forward a received packet. Traffic manager 106 may then forward the packet to the active virtual machine, which may then determine how the packet is to be forwarded. The active virtual machine may then program traffic manager 106 for forwarding that packet. The packet may then be forwarded by traffic manager 106.

[0063] In certain implementations, packet processing chips or merchant ASICs provided by various third-party vendors may be used for traffic manager 106 depicted in FIG.1. For example, in some embodiments, Ethernet switching chips provided by Broadcom® may be used. For example, in some embodiments, the Jericho packet processor and traffic manager chip (BCM88670) provided by Broadcom® may be used as traffic manager 106.
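The lookup-then-punt behavior of paragraphs [0061] and [0062] may be sketched as follows. This is an illustrative simplification under stated assumptions: the `TrafficManager`, `ActiveVM`, and `forward` names are hypothetical, and an exact-match dictionary keyed on a destination string stands in for whatever lookup structure a real packet processor uses.

```python
class TrafficManager:
    """Holds forwarding entries programmed by the active virtual machine."""

    def __init__(self):
        self.table = {}  # destination -> output port (hypothetical exact-match table)

    def program(self, destination, output_port):
        self.table[destination] = output_port

    def lookup(self, destination):
        return self.table.get(destination)


class ActiveVM:
    """Stands in for the active virtual machine that maintains routing information."""

    def __init__(self, routes):
        self.routes = routes
        self.punts = 0  # packets handed up by the traffic manager

    def resolve(self, destination):
        self.punts += 1
        return self.routes.get(destination)


def forward(packet, tm, active_vm):
    """Return the output port for a packet, or None if no route is known."""
    dest = packet["dst"]
    port = tm.lookup(dest)
    if port is None:
        # Traffic manager cannot decide: punt to the active virtual machine.
        port = active_vm.resolve(dest)
        if port is None:
            return None
        # The active virtual machine programs the traffic manager for this destination.
        tm.program(dest, port)
    return port


tm = TrafficManager()
vm = ActiveVM({"10.0.0.1": 3})
first = forward({"dst": "10.0.0.1"}, tm, vm)   # miss, punted, then programmed
second = forward({"dst": "10.0.0.1"}, tm, vm)  # served by the traffic manager alone
```

In this sketch only the first packet reaches the active virtual machine; later packets to the same destination are resolved from the programmed entry.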

[0064] FIG.2 is a simplified block diagram of yet another example network device 200. Network device 200 depicted in FIG.2 is commonly referred to as a chassis-based system (network device 100 depicted in FIG.1 is sometimes referred to as a “pizza-box” system). Network device 200 may be configured to receive and forward packets, which may be data packets or signaling or protocol-related packets (e.g., keep-alive packets). Network device 200 comprises a chassis that includes multiple slots, where a card or blade or module can be inserted into each slot. This modular design allows for flexible configurations, with different combinations of cards in the various slots of the network device for supporting differing network topologies, switching needs, and performance requirements.

[0065] In the example depicted in FIG.2, network device 200 comprises multiple line cards (including first line card 202 and a second line card 204), two management cards/modules 206, 208, and one or more switch fabric modules (SFMs) 210. A backplane 212 is provided that enables the various cards/modules to communicate with each other. In certain embodiments, the cards may be hot swappable, meaning they can be inserted and/or removed while network device 200 is powered on. In certain implementations, network device 200 may be a router or a switch such as various routers and switches provided by Brocade Communications Systems, Inc. of San Jose, California.

[0066] Network device 200 depicted in FIG.2 is merely an example and is not intended to unduly limit the scope of inventive embodiments recited in the claims. One of ordinary skill in the art would recognize many variations, alternatives, and modifications. For example, in some embodiments, network device 200 may have more or fewer components than shown in FIG.2, may combine two or more components, or may have a different configuration or arrangement of components.

[0067] In the example depicted in FIG.2, network device 200 comprises two redundant management modules 206, 208. The redundancy enables the management modules to operate according to the active-standby model, where one of the management modules is configured to operate in standby mode (referred to as the standby management module) while the other operates in active mode (referred to as the active management module). The active management module may be configured to perform management and control functions for network device 200 and may represent the management plane for network device 200. The active management module may be configured to execute applications for performing management functions such as maintaining routing tables, programming the line cards (e.g., downloading information to a line card that enables the line card to perform data forwarding functions), and the like. In certain embodiments, both the management modules and the line cards act as a control plane that programs and makes programming decisions for packet processors or traffic managers in a network device. In a chassis-based system, a management module may be configured as a coordinator of multiple control planes on the line cards.

[0068] When a failover or switchover occurs, the standby management module may become the active management module and take over performance of the set of functions performed by a management module in active mode. The management module that was previously operating in active mode may then become the standby management module. The active-standby model in the management plane enhances the availability of network device 200, allowing the network device to support various high-availability functionality such as graceful restart, non-stop routing (NSR), and the like.

[0069] In the example depicted in FIG.2, management module 206 is shown as operating in active mode and management module 208 is shown as operating in standby mode. Management modules 206 and 208 are communicatively coupled to the line cards and switch fabric modules (SFMs) 210 via backplane 212. Each management module may comprise one or more processors, which could be single or multicore processors and associated system memory. The processors may be general purpose microprocessors such as ones provided by Intel®, AMD®, ARM®, Freescale Semiconductor, Inc., and the like, which operate under the control of software stored in associated memory.

[0070] A switch fabric module (SFM) 210 may be configured to facilitate communications between the management modules 206, 208 and the line cards of network device 200. There can be one or more SFMs in network device 200. Each SFM 210 may include one or more forwarding elements (FEs) 218. The forwarding elements provide an SFM the ability to forward data from an input to the SFM to an output of the SFM. An SFM may facilitate and enable communications between any two modules/cards connected to backplane 212. For example, if data is to be communicated from one line card 202 to another line card 204 of network device 200, the data may be sent from the first line card 202 to SFM 210, which then causes the data to be communicated to the second line card using backplane 212. Likewise, communications between management modules 206, 208 and the line cards of network device 200 are facilitated using SFMs 210.

[0071] In the example depicted in FIG.2, network device 200 comprises multiple line cards including line cards 202 and 204. Each line card may comprise a set of ports that may be used for receiving and forwarding packets. The ports of a line card may be capable of receiving and/or transmitting different types of network traffic at different speeds, such as speeds of 1 Gbps, 10 Gbps, 100 Gbps, or more. Various different configurations of line card ports may be provided in network device 200. For example, configurations may include four line cards each with 72 10 Gbps ports, eight line cards each with 60 40 Gbps ports, a line card with 36 100 Gbps ports, and various other combinations.

[0072] Each line card may include one or more single or multicore processors, a system memory, a traffic manager, and one or more hardware resources. In certain implementations, the components on a line card may be configured similar to the components of network device 100 depicted in FIG.1 (components collectively represented by reference 150 from FIG.1 and also shown in line cards 202, 204 in FIG.2).

[0073] A packet may be received by network device 200 via a port on a particular line card. The port receiving the packet may be referred to as the input port and the line card as the source/input line card. The traffic manager on the input line card may then determine, based upon information extracted from the received packet, an output port to be used for forwarding the received packet from network device 200. The output port may be on the same input line card or on a different line card. If the output port is on the same line card, the packet is forwarded by the traffic manager on the input line card from the input port to the output port and then forwarded from network device 200 using the output port. If the output port is on a different line card, then the packet is forwarded from the input line card to the line card containing the output port using backplane 212. The packet is then forwarded from network device 200 by the traffic manager on the output line card using the output port.
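The two cases in paragraph [0073] (output port on the same line card versus on a different line card) may be sketched as a function that lists the hops a packet takes inside the chassis. The function name `forwarding_path` and the string labels are hypothetical and serve only to illustrate the branching.

```python
def forwarding_path(input_card, output_card):
    """Return the ordered internal hops from the input port to the output port."""
    if input_card == output_card:
        # Same line card: its traffic manager moves the packet port to port.
        return [f"traffic manager on line card {input_card}"]
    # Different line cards: cross the backplane, then the output card's
    # traffic manager sends the packet out of the output port.
    return [
        f"traffic manager on line card {input_card}",
        "backplane 212",
        f"traffic manager on line card {output_card}",
    ]


local_path = forwarding_path(202, 202)
remote_path = forwarding_path(202, 204)
```

Here `local_path` has a single hop, while `remote_path` passes through backplane 212 between the two traffic managers.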

[0074] In certain instances, the traffic manager on the input line card may be unable to determine how to forward a received packet. The traffic manager may then forward the packet to the active virtual machine on the line card, which then determines how the packet is to be forwarded. The active virtual machine may then program the traffic manager on the line card for forwarding that packet. The packet may then be forwarded by that traffic manager to the output port (which may be on the input line card or some other line card) and then forwarded from network device 200 via the output port.

[0075] In certain instances, the active virtual machine on an input line card may be unable to determine how to forward a received packet. The packet may then be forwarded to the active management module, which then determines how the packet is to be forwarded. The active management module may then communicate the forwarding information to the line cards, which may then program their respective traffic managers based upon the information. The packet may then be forwarded to the line card containing the output port (which may be the input line card or some other line card) and then forwarded from network device 200 via the output port.

Software Keepalive Process (SKAP) Architecture

[0076] Accordingly, a need exists to be able to transmit keepalive messages or packets via software when a switchover is being performed from an active subsystem (e.g., virtual machine) to a standby subsystem. The following description describes a software keepalive process (SKAP) along with the supporting infrastructure and user Application Program Interfaces (APIs). The SKAP may be executed by a software keepalive subsystem.

[0077] Existing solutions use Field Programmable Gate Arrays (FPGAs) or in-house Application Specific Integrated Circuits (ASICs) to provide hardware keepalive capabilities. However, these solutions typically have limited keepalive support when implemented in hardware. Thus, a SKAP may provide advantages over a hardware-based keepalive process, especially in situations involving switchover between network operating subsystems.

[0078] It may be desirable for the SKAP to support different keepalive protocols and their associated requirements. The following table illustrates some (but not all) of the requirements of various keepalive protocols:

Table 1, above, describes the requirements of various keepalive protocols. For example, the Intermediate System to Intermediate System (IS-IS) routing protocol supports a packet size between 1500 bytes and the maximum, a minimum period of 1 second, a maximum of 240 sessions per line card (LC), and a total bandwidth per LC of 2,880 Kbps. While this is just one example, the SKAP design may support keepalives for protocols having even the most aggressive keepalive timers. For example, the SKAP may support a maximum of 68,000 sessions and a total bandwidth from the LC CPU of 191 Mbps per line card.
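The IS-IS figure quoted above can be checked with simple arithmetic. The sketch below assumes that each session sends one 1500-byte packet per period and that 1 Kbps equals 1,000 bits per second; the function name `keepalive_bandwidth_kbps` is hypothetical.

```python
def keepalive_bandwidth_kbps(packet_bytes, period_seconds, sessions):
    """Aggregate keepalive bandwidth in Kbps (1 Kbps = 1,000 bits per second)."""
    bits_per_packet = packet_bytes * 8
    packets_per_second = sessions / period_seconds
    return bits_per_packet * packets_per_second / 1000


# IS-IS values from the text: 1500-byte packets, 1 second period, 240 sessions per LC.
isis_kbps = keepalive_bandwidth_kbps(1500, 1, 240)
print(isis_kbps)  # 2880.0
```

That is, 240 sessions × 1500 bytes × 8 bits per second gives 2,880,000 bits per second, which matches the 2,880 Kbps stated for IS-IS.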

[0079] A majority of networking protocols are session based with respect to keepalive sessions. In other words, peers of a protocol exchange keepalive messages or packets (“heartbeats”) to establish and allow continuation of the connectivity amongst them. For a keepalive network session, failure to send the keepalive messages or packets in a particular time interval defined by a protocol for that session may result in that session being disconnected and cleanup of the session. Thus, the SKAP has to ensure that the keepalive messages or packets are transmitted continuously within the expected time intervals.

These keepalive messages or packets also need to be sent during certain scenarios, such as during software upgrades (a.k.a. In Service Software Upgrade (ISSU)), planned Active-Standby switchover, unplanned failover, etc., to ensure that the session is not unduly disconnected.

[0080] FIG.3 is a simplified block diagram of a network device 200 including a SKAP agent 320 according to certain embodiments. The SKAP agent 320 may implement the functionalities of the SKAP described above. SKAP agent 320 is external to both the active virtual machine 116 and the standby virtual machine 118 executing on a line card. SKAP agent 320 may also be referred to as a SKAP process or SKAP subsystem. In certain embodiments, SKAP agent 320 executes on a line card in host space 114 depicted in FIG.1 (e.g., process-1120 depicted in FIG.1 may be SKAP agent 320). The network device 200 thus includes a system external to the active system that is present when the active VM 116 may fail. The management card 206 may include a network OS running multiple protocols that make inter-process communication (IPC) calls to the active VM 116 that may register and set up the keepalive network sessions along with their out-port duration, packet content, and sequence requirements. Information related to the keepalive network sessions may be stored in a shared database 310. In certain embodiments, the virtual machine operating in active mode (e.g., VM 116) may write keepalive network session data to the shared database 310. The SKAP agent 320 may read from the inter-virtual machine shared database 310 and send out keepalive packets using a special scheduler (e.g., SKAP scheduler). In the event that the active VM 116 fails, the standby VM 118 may start the takeover process, which may take anywhere from a few seconds to a few minutes to complete. Since the SKAP agent 320 is external to the active VM 116 and standby VM 118, and still has access to the shared database 310, keepalive messages or packets may continue to be sent during the switchover since the SKAP agent 320 is responsible for the transmission of the keepalive messages.

[0081] When the active VM 116 enters into standby mode and the standby VM 118 becomes the “active VM”, the shared database 310 may remain unchanged and the keepalives may actively be exchanged, despite the switchover, keeping the remote peer protocols “happy”.

[0082] The SKAP may provide a number of advantages. For example, the SKAP may allow for timeliness (packets are sent regardless of CPU usage), granularity (preciseness), in service software upgrade (ISSU) support (by virtue of the shared database 310 and redundancy), failover (switchover) support, and support for sequence numbers for certain protocols like Unidirectional Link Detection (UDLD).

[0083] Further, the architecture described in FIG.3 may be a “lockless architecture.” It can be appreciated that achieving mutual exclusion between the virtual machines and the SKAP agent 320 can be expensive. The lowest level of granularity (e.g., 50ms) may not be achieved if locks need to be taken for 68,000 rows. Thus, 32-bit atomic reads and ownership can be used to achieve the mutual exclusion. This can be done by assuming that the latest update to the shared database 310 may not be needed by the scheduler immediately. Each item can be added to the shared database 310 by the active VM 116 or standby VM 118 with a lock bit set and an ownership bit. If the SKAP agent 320 keeps checking for a valid bit and finds that an invalid bit is set, the SKAP agent 320 may ignore that particular entry if ownership is still with the SKAP agent 320. The active VM 116 may first change the valid bit and then the ownership bit. With a 32-bit read being atomic, there may not be a mutual exclusion issue. Once the active VM 116 changes the ownership, if the valid bit is 0 the entry may be ignored. If the entry made is valid, the entry may be understood to be updated/added and the SKAP agent 320 may then take care of it by changing the ownership to SKAP agent 320.
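By way of illustration only, the valid-bit and ownership-bit handshake described above might be sketched as follows. This is one possible reading of the description, not a definitive implementation: the bit positions, the class, and the function names are hypothetical, and a plain Python integer merely stands in for the 32-bit word that the description assumes can be read atomically (the sketch is single-threaded and provides no real atomicity).

```python
VALID = 0x1       # assumed bit: entry contents are complete and usable
OWNER_VM = 0x2    # assumed bit: a VM has changed the entry and the agent has not yet claimed it
MASK32 = 0xFFFFFFFF


class Entry:
    def __init__(self):
        self.flags = 0        # stands in for one 32-bit word read in a single access
        self.payload = None


def vm_write(entry, payload):
    """VM side: invalidate, write, then change the valid bit before the ownership bit."""
    entry.flags &= ~VALID & MASK32   # the agent ignores the entry while it is rewritten
    entry.payload = payload
    entry.flags |= VALID             # first change the valid bit...
    entry.flags |= OWNER_VM          # ...then the ownership bit


def vm_delete(entry):
    """VM side: clear the valid bit, then signal the change via the ownership bit."""
    entry.flags &= ~VALID & MASK32
    entry.flags |= OWNER_VM


def agent_poll(entry):
    """SKAP agent side: a single read of the flags word decides what to do."""
    flags = entry.flags
    if not flags & VALID:
        return "ignored"             # invalid entries are skipped
    if flags & OWNER_VM:
        entry.flags = flags & ~OWNER_VM & MASK32   # agent takes ownership
        return "picked_up"           # entry was added or updated by the VM
    return "unchanged"               # agent already owns a valid entry
```

In this sketch the agent never blocks: an entry that is mid-update simply appears invalid and is picked up on a later scan, which matches the stated assumption that the scheduler may not need the latest update immediately.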

[0084] Additionally, the architecture described may have distributed data structures. The inter-virtual machine shared database 310 is available to the VMs 116-118 and the SKAP agent 320, which runs parallel to the VMs in the hypervisor. Hence the shared database 310 can be accessed based on an offset from the memory mapped shared memory. The shared database 310 may be a simple array that allows access based on a keepalive entry index for deletes and updates. Addition may always be done on the first available entry for a specific protocol group, e.g., keepalive interval order. The maintenance of the free list can also be done as an array that references the database table, e.g., an index table or list per protocol group. An additional Georgy Adelson-Velsky and Evgenii Landis' (AVL) tree may be maintained for easy traversal. The AVL tree may be offset based such that both the VMs 116-118 and the SKAP agent 320 can access the tree seamlessly.
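By way of illustration only, an index-addressed table with one free list per protocol group might be sketched as follows. The class name, the method names, and the choice to give each group a contiguous slice of the array are assumptions made for this sketch; the AVL tree and the shared-memory offsets are omitted.

```python
import heapq


class SessionTable:
    """Fixed-size array of entries with one free list of indices per protocol group."""

    def __init__(self, group_sizes):
        self.free = {}
        start = 0
        for group, size in group_sizes.items():
            # An ascending list of indices is already a valid min-heap.
            self.free[group] = list(range(start, start + size))
            start += size
        self.entries = [None] * start

    def add(self, group, entry):
        """Store the entry at the first (lowest) available index of its group."""
        if not self.free[group]:
            raise RuntimeError(f"no free entry for group {group!r}")
        index = heapq.heappop(self.free[group])
        self.entries[index] = (group, entry)
        return index

    def update(self, index, entry):
        """Update in place, addressed by keepalive entry index."""
        group, _ = self.entries[index]
        self.entries[index] = (group, entry)

    def delete(self, index):
        """Delete by index and return the slot to its group's free list."""
        group, _ = self.entries[index]
        self.entries[index] = None
        heapq.heappush(self.free[group], index)
```

Because every operation is addressed by index, a reader that maps the same memory at a different base address could locate an entry from the index alone, which is the property the offset-based access relies on.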

[0085] FIG.4 illustrates interactions for keepalive setup between the Hardware Subsystem Layer-User Agent (HSLUA) 410, shared database 310 within an inter-VM shared memory (IVSHMEM) 420, and SKAP agent 320 according to some embodiments. The HSLUA 410, IVSHMEM 420, and SKAP agent 320 may reside within the first line card 202.

[0086] At step 1, the management module / card protocol modules may transmit inter-process communication (IPC) messages to the first line card 202 in order to set up the keepalive transmits. For example, management card 206 may transmit the IPC messages to the first line card 202 via backplane 212. The IPC messages from the management card 206 may be received by a KA processing thread 412 within the HSLUA 410.

[0087] At step 2, after the KA processing thread 412 receives the IPC messages from the management card 206, the HSLUA 410 may allocate a software keepalive packet buffer to build the keepalive packet. The software keepalive packet buffer may be allocated within a packet DMA buffer 424 that resides within the IVSHMEM 420.

[0088] At step 3, after the HSLUA 410 allocates a software keepalive packet buffer to build the keepalive packet, the HSLUA 410 may call the software keepalive API library 414 to set up the session as an entry in the shared database 310 located within the IVSHMEM 420. The shared database 310 entry may point to the packet buffer allocated in step 2, above.

[0089] At step 4, after the software keepalive API library 414 sets up the keepalive network session as an entry in the shared database 310, the SKAP main thread 322 may scan the shared database 310 and may determine the keepalive network sessions to be added to a transmit queue 324 based on the keepalive network session’s associated transmit interval. The SKAP main thread 322 may execute within the SKAP agent 320.

[0090] At step 5, after the SKAP main thread 322 determines the sessions to be added to a transmit queue 324, SKAP main thread 322 may add the session to the transmit queue 324. The transmit queue 324 may be a sorted queue (tree) that allows quick checking of sessions that are ready to be transmitted.

[0091] At step 6, after the SKAP main thread 322 adds the session to the transmit queue 324, a SKAP packet transmit driver 326 checks the transmit queue 324 for sessions that are ready to be transmitted. A timer loop 327 within the SKAP main thread 322 may dictate the predefined interval at which the SKAP packet transmit driver 326 wakes up to check the transmit queue 324 for sessions that are ready to be transmitted.

[0092] At step 7, after the SKAP packet transmit driver 326 checks the transmit queue 324 for sessions that are ready to be transmitted, the SKAP packet transmit driver 326 may transmit packets for keepalive network sessions that are ready to be transmitted.

[0093] At step 8, after the SKAP packet transmit driver 326 transmits packets for the keepalive network sessions that are ready to be transmitted, the SKAP packet transmit driver 326 may update the transmit time stamp at which packets were transmitted for the keepalive network sessions.

[0094] In one example, the SKAP main thread 322 may use timer loop 327 at a period of 10 ms in order to schedule packets to be sent out. The sessions may be added to the transmit queue 324, which may be sorted in order of expiration time for efficiency. The SKAP packet transmit driver 326 may wake up every 10 ms (when the timer loop 327 timer expires) and may check the timer queue 324 for any sessions that have expired timers. Packets may be sent for any sessions that have expired timers. When adding sessions to the timer queue 324, a random time from 1 ms to 100 ms may be added to the initial timer expiration to stagger the timer expirations of each session. Doing so may spread out session expiration times, which may prevent many sessions from expiring at the same time and causing a higher CPU load.
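By way of illustration only, the queue sorted by expiration time and the 1 ms to 100 ms stagger described above might be sketched as follows. A binary heap stands in for the sorted queue (tree); the class and method names are hypothetical, and the sketch assumes that the initial timer expiration is the registration time plus the keepalive interval and that an expired session is re-armed one interval after the tick on which it was sent.

```python
import heapq
import random

TICK_MS = 10  # timer loop period used in the example above


class TransmitQueue:
    """Sessions ordered by expiration time; a heap stands in for the sorted tree."""

    def __init__(self, rng=None):
        self._heap = []                      # (expires_at_ms, session_id, interval_ms)
        self._rng = rng or random.Random()

    def add(self, session_id, interval_ms, now_ms):
        # Stagger the first expiration by a random 1-100 ms offset.
        jitter_ms = self._rng.randint(1, 100)
        heapq.heappush(self._heap, (now_ms + interval_ms + jitter_ms, session_id, interval_ms))

    def pop_expired(self, now_ms):
        """Return the session IDs whose timers have expired and re-arm them."""
        due, rearm = [], []
        while self._heap and self._heap[0][0] <= now_ms:
            _, session_id, interval_ms = heapq.heappop(self._heap)
            due.append(session_id)
            rearm.append((now_ms + interval_ms, session_id, interval_ms))
        # Re-arm after the loop so a session is returned at most once per call.
        for item in rearm:
            heapq.heappush(self._heap, item)
        return due
```

A driver loop would call `pop_expired(now_ms)` once per `TICK_MS` and transmit a packet for each returned session ID. Only the head of the heap is inspected on each wake-up, so a tick with nothing due costs a single comparison.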

[0095] The shared database 310 may contain a total of approximately 68,000 entries (keepalive network sessions). The following table illustrates attributes stored for each keepalive network session stored in the shared database 310:

Table 2: Shared database entry fields

[0096] The following table shows exemplary contents of the shared database 310:

Table 3: Example shared database entries

[0097] As shown in Table 3 above, the following attributes may be stored for each keepalive network session in the shared database 310: entry valid bit, lock bit, user, time interval, protocol, sequence ID start, sequence field offset, packet offset, and registration time. While Table 3 depicts just a few of the attributes illustrated in Table 2, any number of attributes illustrated in Table 2 may be stored for each

Software Keepalive Process (SKAP) Scheduler

[0098] As described above, the SKAP agent 320 may include a scheduler (SKAP scheduler) for sending the keepalive messages/packets stored in the shared database 310. A scheduler may be important because applications require real-time scheduling support of a very large number of periodic timers for various keepalive network sessions. The software-based keepalive process is an example of such an application. The SKAP agent 320 may have real-time requirements, may need to be able to scale to tens of thousands of periodic keepalive network sessions, and may need to be able to adjust for bounded latency and jitter to support some requirements of stricter networking protocols.

[0099] Different network protocols may have different keepalive requirements. In some embodiments, the SKAP scheduler may accommodate a max of 68,528 sessions (e.g., in the case of fully scaled and most aggressive timers). In some other embodiments, even more keepalive network sessions may be supported.

[0100] Referring again to FIG.4, the transmit queue 324, the SKAP packet transmit driver 326 and SKAP main thread 322 may together make up the SKAP scheduler. The SKAP scheduler may interface with the shared database 310 and the packet DMA buffer 424, which both reside within the IVSHMEM 420. As described above, the IVSHMEM 420 can be accessed by the VMs and the host. In some embodiments, the SKAP scheduler may make use of a single monotonic timer having a period of 10 ms. Each 10 ms time period may be regarded as one “time tick”. The time tick value may reset every 100 ms (e.g., 10 ticks). The shared database 310 may be scanned every 100 ms (e.g., 10 time ticks) and contents may be placed in the timer queue 324. The contents of the timer queue 324 may include the session IDs of the keepalives to be transmitted within the next 100 ms. The timer queue 324 may be divided into a set of 10 banks, each bank corresponding to a 10 ms time period. In some embodiments, each bank may hold up to 4500 packets, as the SKAP packet transmit driver 326 can transmit up to 4500 packets in every 10 ms interval.

[0101] In addition to the packet DMA buffer 424 and the shared database 310, the IVSHMEM 420 may also contain a time stamp that can be incremented every 10 ms, and a max session ID field indicating the maximum session ID allocated by the HSLUA. The time stamp field within the IVSHMEM 420 can be used to manage VM/host time synchronization. The host scheduler may increment a 64-bit time stamp field every 10 ms. The max session ID field may be updated by the HSLUA. This may help in optimizing the SKAP scheduler 710 to check for valid sessions up until the present max session ID.

[0102] FIG.5A illustrates a timer queue 324 having a plurality of buckets 510 holding session IDs associated with keepalive network sessions, according to some embodiments. The timer queue 324 may be configured to store session IDs of keepalive network sessions that need to be transmitted within the next 100 ms time period. The SKAP scheduler may scan valid entries in the shared database 310 for packet transmit scheduling. Sessions to be transmitted in the next 100 ms may be added to an appropriate 10 ms bank of a transmit queue.

[0103] Typically, a session timeout of most protocols occurs after three missed keepalive packets. After three missed keepalive packets, the protocol may consider the session link to be broken or down. As described above, when a session starts, session information may be written into the shared database 310. Every 100 ms, the SKAP scheduler may scan the shared database 310 to determine the number of keepalive packets that need to be sent over the next 100 ms time period and at which time period within the next 100 ms they actually need to be transmitted. Any keepalive network sessions that need to be transmitted in the next 100 ms (based on the last transmit time) can be moved into the timer queue 324. As shown in the figure, each bucket 510 displays a session ID for the particular session. The sessions can be populated across the ten banks 520 depicted. Each bank 520 may hold the session IDs within the buckets 510 for the keepalive network sessions that need to be transmitted within the time period defined by the particular bank 520. For example, “bank 2” may contain the session IDs within each bucket 510 for keepalive network sessions that need to be transmitted within the next 30 ms time period. The session IDs stored in the buckets 510 associated with “bank 2” include session IDs 301, 302, 334, 336, 435. In another example, “bank 4” may contain the session IDs within each bucket 510 for keepalive network sessions that need to be transmitted within the next 50 ms time period. The session IDs stored in the buckets 510 associated with “bank 4” include 270, 300, 317, 470, 489. While 100 ms is used as the example time period when the SKAP scheduler scans the shared database 310, any time period may be used in other embodiments.

[0104] Every 100ms, the shared database 310 may be scanned again and the timer queue 324 may be updated to reflect the keepalives that need to be transmitted over the next 100ms from time t=0. For example, looking at the timer queue 324 depicted in the figure, at time t=0 keepalives for the following session IDs (in bank0) need to be transmitted in the next 10ms: 1, 2, 3, 4, 5, 6. At time t=10ms, the keepalives for the session IDs in bank0 may have been transmitted and the process for transmitting the keepalives associated with the session IDs in bank1 may begin. Similarly, the keepalives for the session IDs stored in each bank 520 may be transmitted at every 10ms interval. For example, at time t=20ms, transmission of the keepalives for the session IDs in bank2 may begin, and so on and so forth. This is depicted with respect to FIG.5B. When keepalives associated with session IDs in bank 9 are transmitted, the banks 520 may be cleared and the process may start over by scanning the shared database 310 and populating the banks 520 again.
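By way of illustration only, one 100 ms cycle of draining the banks in order might be sketched as follows. The function name and the `transmit` callback are hypothetical, and the ticks are simulated rather than driven by a real timer.

```python
TICK_MS = 10
NUM_BANKS = 10


def drain_cycle(banks, transmit):
    """Walk bank0..bank9, one bank per 10 ms tick, then clear every bank.

    `transmit(session_id, start_ms)` is called for each session ID, where
    `start_ms` is the offset of the bank's tick from the start of the cycle.
    Returns the length of the cycle in milliseconds.
    """
    for tick, bank in enumerate(banks):
        start_ms = tick * TICK_MS
        for session_id in bank:
            transmit(session_id, start_ms)
    for bank in banks:
        bank.clear()          # the banks are repopulated by the next database scan
    return len(banks) * TICK_MS


# Example using the session IDs given above for bank0 and "bank 2".
banks = [[] for _ in range(NUM_BANKS)]
banks[0] = [1, 2, 3, 4, 5, 6]
banks[2] = [301, 302, 334, 336, 435]
sent = []
cycle_ms = drain_cycle(banks, lambda session_id, start_ms: sent.append((start_ms, session_id)))
```

After the call, `sent` lists the bank0 sessions at offset 0 followed by the bank 2 sessions at offset 20, and every bank is empty and ready for the next scan.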

[0105] In some embodiments, the SKAP scheduler may be able to auto adjust for skew. For example, when a keepalive network session has a time interval less than one second and greater than or equal to 100 ms, the session may be randomly distributed in the nearest five banks within the range of the original bank. In the case where the session has a time interval greater than one second, the session may be randomly distributed in any of the 10 banks. In the case where the session has a time interval below 100 ms, no randomizing of the session distribution may take place. By randomizing the keepalive network sessions that do not need to be transmitted within the next 100 ms, more flexibility may be achieved for transmitting the keepalive packets since there is no urgency in transmitting keepalive network sessions that do not need to be transmitted within the next 100 ms. Keepalive network sessions that need to be transmitted with more urgency (e.g., in the next 100 ms) may be placed in the banks 520 with lower chances of the banks 520 being filled since the less urgent keepalive network sessions may be randomized.

[0106] Additionally, overflow may be enabled for sessions which missed being assigned to a bank in the last setup cycle due to the banks being full. Such a session entry may be given priority when it is added in the current transmission setup cycle. For example, if a particular keepalive network session could not be placed in any of the banks 520 due to the banks 520 already being filled, that particular keepalive network session may be given priority for bank 520 assignment in the next cycle (e.g., at t=100ms) of setting up the timer queue 324. These concepts may be further understood in the following description of FIG.6.
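By way of illustration only, bank capacity and overflow carry-over might be sketched as follows. The function names are hypothetical, and the search order used when the preferred bank is full (later banks first, then earlier banks) is an assumption of this sketch; the description states only that a session which cannot be placed in any bank is given priority in the next cycle.

```python
BANK_CAPACITY = 4500   # packets per 10 ms bank, per the example above
NUM_BANKS = 10


def place(banks, preferred, session_id, capacity):
    """Put the session in its preferred bank, or the next bank with room.

    Returns False when every bank is full.
    """
    order = list(range(preferred, len(banks))) + list(range(0, preferred))
    for bank_id in order:
        if len(banks[bank_id]) < capacity:
            banks[bank_id].append(session_id)
            return True
    return False


def build_cycle(requests, carried_over, capacity=BANK_CAPACITY, num_banks=NUM_BANKS):
    """Populate the banks for one cycle.

    `requests` is a list of (session_id, bank_id) pairs from the database scan;
    `carried_over` holds session IDs that overflowed in the previous cycle.
    Returns (banks, overflow), where `overflow` feeds the next cycle.
    """
    banks = [[] for _ in range(num_banks)]
    overflow = []
    # Sessions that overflowed last cycle are placed first, starting at bank0.
    for session_id in carried_over:
        if not place(banks, 0, session_id, capacity):
            overflow.append(session_id)
    for session_id, bank_id in requests:
        if not place(banks, bank_id, session_id, capacity):
            overflow.append(session_id)
    return banks, overflow
```

Passing the returned `overflow` list as `carried_over` on the next call gives the missed sessions first claim on bank0 before any newly scanned session is placed.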

[0107] FIG.6 is a flowchart 600 illustrating the process of the SKAP scheduler according to some embodiments. The flowchart 600 illustrates the process of placing session IDs associated with the keepalive network sessions for which keepalive packets need to be transmitted into appropriate banks within the timer queue. The process of scheduling keepalive network sessions begins at block 610.

[0108] At block 612, each keepalive network session entry in the shared database is scanned. The keepalive network session entries may have been placed into the shared database by one or more virtual machines executing on the network device. The shared database may reside within a line card and be accessible by the one or more virtual machines. The shared database may store entries for keepalive network sessions associated with one or more protocols running on the network device.

[0109] At block 614, after each entry in the shared database is scanned, a determination is made whether the keepalive network session entries in the shared database are valid and unlocked. If they are both valid and unlocked, the process may continue to block 616. Otherwise, if an entry is invalid or locked, the scheduling for that particular entry may not occur and the process may end at block 642.

[0110] At block 616, after the determination is made that the entry is valid and unlocked, a determination is made whether the entry is an overflow entry. An overflow entry may be considered an entry for a keepalive network session that for some reason was not able to be assigned to a particular bank in the timer queue for keepalive scheduling. One example of such a reason is that no banks were available in the timer queue for the previous cycle. Accordingly, if an entry is an overflow entry it may be given the highest priority for the present timer queue cycle and may be assigned to bank0, where entries within bank0 will be transmitted within the next 10ms (block 618). The process may then continue from block 618 to block 640 where the session entry may be placed in the appropriate bank within the timer queue.

[0111] Otherwise, if the entry is not an overflow entry, the process may continue to block 620.

[0112] At block 620, after determining that the entry is not an overflow entry, a next transmission time for the entry is determined based on its last transmit time and the frequency of the keepalive transmission. For example, the time for the next keepalive transmission for the session may be determined by taking the last keepalive transmission time for the session and adding the keepalive frequency to it.

[0113] At block 622, after determining the next keepalive transmission time for the session entry, a determination is made whether the determined next keepalive transmission time for the session entry falls within the next 100ms. If it is determined that the next keepalive transmission time for the session entry does not fall within the next 100ms, a determination is made whether the next transmit time is before the current time (block 626). In other words, a determination is made whether the keepalive transmission was missed for the particular session entry and the transmission is now overdue. If the transmission was missed, the session entry may be assigned bank0 (block 628) and then placed into bank0 (block 640). Otherwise, if the next transmission time for the session entry is not before the current time, the process may end at block 642. In other words, it may be determined that session entry is not overdue and the next keepalive transmission time does not fall within the next 100ms, so it may not be imperative to schedule the keepalive transmission for the session entry during the current 100ms cycle of the timer queue.

[0114] Referring again to block 622, if a determination is made that the next transmission time for the session entry is within the next 100ms of the current time, the next transmission time may be divided by 10 (the length of a time tick in milliseconds), with the result taken modulo 10, to determine a bank ID for the session entry (block 624). For example, if the next transmission time for the session entry is 54ms from the current time, the bank ID for the session entry may be determined to be 5. The process may then continue to block 630.

[0115] At block 630, after the bank ID for the session entry is determined, a determination is made whether the keepalive frequency associated with the session entry is greater than or equal to 100ms and less than 1s. If it is determined that the keepalive frequency associated with the session entry is greater than or equal to 100ms and less than 1s, the bank assignment for the entry may be randomized based on the bank ID determined in block 624 and the frequency of the session entry. For example, if the bank ID determined in block 624 is less than 5, the session entry may be assigned to a random bank between bank0-bank4 (e.g., the 5 highest banks in terms of priority). Otherwise, if the bank ID determined in block 624 is greater than or equal to 5, the session entry may be assigned randomly anywhere between bank0-bank9 (e.g., any of the ten banks). Once the bank assignment of the session entry is complete, the session entry may be added to the appropriate bank in the timer queue (block 640).

[0116] Referring again to block 630, if it is determined that the keepalive frequency associated with the session entry is not greater than or equal to 100ms and less than 1s, a determination is made whether the keepalive frequency associated with the session entry is less than 100ms or greater than 1s (block 634). If it is determined that the keepalive frequency associated with the session entry is greater than 1s, the bank assignment for the session entry is randomized across any of the ten banks (e.g., bank0-bank9) (block 638). Otherwise, if it is determined that the keepalive frequency associated with the session entry is less than 100ms, the session entry may be given a bank assignment without randomization (block 636). In either case, after the bank assignment is determined, the keepalive network session may be added to the appropriate bank in block 640.
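By way of illustration only, the bank assignment decisions of flowchart 600 might be sketched as a single function. The dictionary keys and the function name are hypothetical; the sketch assumes all times are in milliseconds, treats a keepalive frequency of exactly 1s the same as one greater than 1s (the description does not address that boundary), and returns `None` where the process ends at block 642 without scheduling the entry.

```python
import random

CYCLE_MS = 100
TICK_MS = 10
NUM_BANKS = 10


def assign_bank(entry, now_ms, rng=random):
    """Return a bank index (0-9) for the entry, or None if it is not scheduled."""
    # Block 614: skip entries that are invalid or locked.
    if not entry["valid"] or entry["locked"]:
        return None
    # Blocks 616/618: overflow entries get the highest priority bank.
    if entry.get("overflow"):
        return 0
    # Block 620: next transmission time = last transmit time + keepalive frequency.
    interval_ms = entry["interval_ms"]
    delta_ms = entry["last_tx_ms"] + interval_ms - now_ms
    # Blocks 626/628: an overdue transmission goes to bank0.
    if delta_ms < 0:
        return 0
    # Block 622: not due within this 100 ms cycle, so nothing to schedule.
    if delta_ms >= CYCLE_MS:
        return None
    # Block 624: e.g., 54 ms from now maps to bank 5.
    bank_id = (delta_ms // TICK_MS) % NUM_BANKS
    # Block 630: 100 ms <= frequency < 1 s is randomized based on the bank ID.
    if 100 <= interval_ms < 1000:
        if bank_id < 5:
            return rng.randrange(0, 5)
        return rng.randrange(0, NUM_BANKS)
    # Blocks 634/638: frequency above 1 s is randomized across all ten banks.
    if interval_ms >= 1000:
        return rng.randrange(0, NUM_BANKS)
    # Block 636: frequency below 100 ms keeps its bank without randomization.
    return bank_id
```

The caller would then add the session ID to the returned bank (block 640). Passing a seeded `random.Random` instance as `rng` makes the randomized branches repeatable for testing.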

[0117] In the above exemplary process flow, the thresholds used in the processing are not intended to be limiting, and other thresholds may be used in alternative embodiments.

[0118] The scheduler may be referred to as “self-healing” in the sense that any overdue keepalive transmissions may be corrected in a future scheduling cycle. The SKAP scheduler also provides many benefits. It provides for real-time scheduling that is deterministic in the sense that it is a simple static algorithm that is robust with low overhead and offers bounded max latency and jitter. It may use only a single monotonic OS timer. The scheduler may scale to a very large number of periodic timers (e.g., tens of thousands) while supporting variable timer periods and prioritized scheduling with minimal CPU impact. Further, as described, the scheduler auto-adjusts for skew and prioritizes high frequency protocols.

[0119] It can be appreciated that while the SKAP architecture and scheduler described above is described with respect to specific examples, the SKAP architecture and scheduler can be extended to any control processor having a shared database accessible by multiple subsystems. For example, the control processor can be a part of one or more network processors.

[0120] In certain embodiments, a non-transitory machine-readable or computer-readable medium is provided for storing data and code (instructions) that can be executed by one or more processors. Examples of non-transitory machine-readable or computer-readable medium include memory disk drives, Compact Disks (CDs), optical drives, removable media cartridges, memory devices, and the like. A non-transitory machine-readable or computer-readable medium may store the basic programming (e.g., instructions, code, program) and data constructs, which when executed by one or more processors, provide the functionality described above. In certain implementations, the non-transitory machine-readable or computer-readable medium may be included in a network device and the instructions or code stored by the medium may be executed by one or more processors of the network device causing the network device to perform certain functions described above. In some other implementations, the non-transitory machine-readable or computer-readable medium may be separate from a network device but can be accessible to the network device such that the instructions or code stored by the medium can be executed by one or more processors of the network device causing the network device to perform certain functions described above. The non-transitory computer-readable or machine-readable medium may be embodied in non-volatile memory or volatile memory.

[0121] The methods, systems, and devices discussed above are examples. Various embodiments may omit, substitute, or add various procedures or components as appropriate. For instance, in alternative configurations, the methods described may be performed in an order different from that described, and/or various stages may be added, omitted, and/or combined. Features described with respect to certain embodiments may be combined in various other embodiments. Different aspects and elements of the embodiments may be combined in a similar manner. Technology evolves and, thus, many of the elements are examples that do not limit the scope of the disclosure to those specific examples.

[0122] Specific details are given in this disclosure to provide a thorough understanding of the embodiments. However, embodiments may be practiced without these specific details. For example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the embodiments. This description provides example embodiments only, and is not intended to limit the scope, applicability, or configuration of other embodiments. Rather, the preceding description of the embodiments will provide those skilled in the art with an enabling description for implementing various embodiments. Various changes may be made in the function and arrangement of elements.

[0123] Although specific embodiments have been described, various modifications, alterations, alternative constructions, and equivalents are also encompassed within the scope of described embodiments. Embodiments described herein are not restricted to operation within certain specific data processing environments, but are free to operate within a plurality of data processing environments. Additionally, although certain implementations have been described using a particular series of transactions and steps, it should be apparent to those skilled in the art that these are not meant to be limiting and are not limited to the described series of transactions and steps. Although some flowcharts describe operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be rearranged. A process may have additional steps not included in the figure.

[0124] Further, while certain embodiments have been described using a particular combination of hardware and software, it should be recognized that other combinations of hardware and software may also be provided. Certain embodiments may be implemented only in hardware, or only in software (e.g., code programs, firmware, middleware, microcode, etc.), or using combinations thereof. The various processes described herein can be implemented on the same processor or different processors in any combination.

[0125] Where devices, systems, components or modules are described as being configured to perform certain operations or functions, such configuration can be accomplished, for example, by designing electronic circuits to perform the operation, by programming programmable electronic circuits (such as microprocessors) to perform the operation such as by executing computer instructions or code, or processors or cores programmed to execute code or instructions stored on a non-transitory memory medium, or any combination thereof. Processes can communicate using a variety of techniques including but not limited to conventional techniques for inter-process communications, and different pairs of processes may use different techniques, or the same pair of processes may use different techniques at different times.

[0126] The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that additions, subtractions, deletions, and other modifications and changes may be made thereunto without departing from the broader spirit and scope as set forth in the claims. Thus, although specific embodiments have been described, these are not intended to be limiting. Various modifications and equivalents are within the scope of the following claims.