Title:
AN ARCHITECTURE AND ALGORITHM FOR A PROGRAMMABLE PIPELINE TO SUPPORT STATEFUL PACKET PROCESSING
Document Type and Number:
WIPO Patent Application WO/2023/220483
Kind Code:
A2
Abstract:
A programmable pipeline architecture that includes a backward bus coupled to a first pipeline stage processor (PSP) of a programmable pipeline and a second PSP of the programmable pipeline. The first PSP is configured to store a flow state table; receive a packet; read, from the flow state table, a state of a stateful function corresponding to a packet flow of the packet; and process the packet based on the state of the stateful function. The second PSP is configured to receive the packet subsequent to processing by the first PSP; determine that there is a change to the state of the stateful function; write back state update data to the first PSP in response to the change; and forward the packet. The backward bus is configured to carry state update data and resubmitted packets from the second PSP to the first PSP.

Inventors:
SONG HAOYU (US)
Application Number:
PCT/US2023/032893
Publication Date:
November 16, 2023
Filing Date:
September 15, 2023
Assignee:
FUTUREWEI TECH INC (US)
Attorney, Agent or Firm:
DIETRICH, William H. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A pipeline architecture comprising: a first pipeline stage processor (PSP) of a programmable pipeline, the first PSP configured to: store a flow state table; receive a packet; read, from the flow state table, a state of a stateful function corresponding to a packet flow of the packet; and process the packet based on the state of the stateful function; a second PSP of the programmable pipeline, the second PSP configured to: receive the packet subsequent to processing by the first PSP; determine that there is a change to the state of the stateful function; write back state update data to the first PSP in response to the change; and forward the packet; and a backward bus coupled to the first PSP and the second PSP, the backward bus configured to carry the state update data from the second PSP to the first PSP.

2. The pipeline architecture according to claim 1, wherein the first PSP is further configured to: receive the state update data from the second PSP; and update the state of the stateful function in the flow state table from a first state to a second state based on the state update data.

3. The pipeline architecture according to any of claims 1-2, wherein the first PSP is further configured to add metadata to the packet, the metadata comprising a flow identifier of the packet flow, state data, and a packet sequence number of the packet in the packet flow.

4. The pipeline architecture according to any of claims 2-3, wherein the second PSP is further configured to: identify subsequent packets associated with the packet flow that were processed by the first PSP using the first state; and resubmit, using the backward bus, the subsequent packets to the first PSP for reprocessing by the first PSP using the second state.

5. The pipeline architecture according to any of claims 1-4, wherein the first PSP is further configured to process the subsequent packets ahead of new packets.

6. The pipeline architecture according to any of claims 1-5, wherein the second PSP is further configured to maintain a pending flow table (PFT) and use the PFT to identify the subsequent packets.

7. The pipeline architecture according to claim 6, wherein the PFT comprises a packet flow data for each packet flow processed by the second PSP, the packet flow data comprising a flow identifier, a sequence number, a timer, and a dirty flag.

8. The pipeline architecture according to any of claims 6-7, wherein a maximum size of the PFT (D) is equal to D1 plus D2, wherein D1 is a pipeline delay from the first PSP to the second PSP, and wherein D2 is a backward bus delay from the second PSP to the first PSP.

9. The pipeline architecture according to any of claims 7-8, wherein the second PSP is further configured to set the dirty flag to indicate that there may be packets associated with the packet flow that read a stale state.

10. The pipeline architecture according to any of claims 7-9, wherein the second PSP is further configured to set the dirty flag to TRUE in response to the change to the state of the stateful function.

11. The pipeline architecture according to any of claims 7-10, wherein the second PSP is further configured to initialize the timer to the maximum size of the PFT (D) and decrement the timer by 1 every clock cycle.

12. The pipeline architecture according to any of claims 1-11, wherein the second PSP is further configured to remove the metadata prior to forwarding the packet.

13. The pipeline architecture according to any of claims 1-12, wherein the second PSP is further configured to add the packet flow data to the PFT when the PFT does not include the packet flow data corresponding to the packet flow of the packet.

14. The pipeline architecture according to any of claims 1-13, wherein the second PSP is further configured to delete the packet flow data from the PFT when the timer reaches zero.

15. A method of processing a packet comprising: storing, by a first pipeline stage processor (PSP) of a programmable pipeline, a flow state table; receiving, by the first PSP, a packet; reading, by the first PSP from the flow state table, a state of a stateful function corresponding to a packet flow of the packet; processing, by the first PSP, the packet based on the state of the stateful function; receiving, by a second PSP of the programmable pipeline and subsequent to processing by the first PSP, the packet; determining, by the second PSP, that there is a change to the state of the stateful function; writing back, by the second PSP, using a backward bus that communicatively couples the second PSP and the first PSP, state update data to the first PSP in response to the change; and forwarding, by the second PSP, the packet.

16. The method according to claim 15, further comprising: receiving, by the first PSP, the state update data from the second PSP over the backward bus; and updating, by the first PSP, the state of the stateful function in the flow state table from a first state to a second state based on the state update data.

17. The method according to any of claims 15-16, further comprising adding, by the first PSP, metadata to the packet, the metadata comprising a flow identifier of the packet flow, state data, and a packet sequence number of the packet in the packet flow.

18. The method according to any of claims 15-17, further comprising: identifying, by the second PSP, subsequent packets associated with the packet flow that were processed by the first PSP using the first state; resubmitting, by the second PSP and using the backward bus, the subsequent packets to the first PSP; and reprocessing, by the first PSP, the subsequent packets using the second state.

19. The method according to any of claims 15-18, further comprising reprocessing, by the first PSP, the subsequent packets using the second state prior to processing new packets.

20. The method according to any of claims 15-19, further comprising: maintaining, by the second PSP, a pending flow table (PFT); and identifying, by the second PSP, the subsequent packets using the PFT.

21. The method according to claim 20, wherein the PFT comprises a packet flow data for each packet flow processed by the second PSP, the packet flow data comprising a flow identifier, a sequence number, a timer, and a dirty flag.

22. The method according to claim 21, wherein a maximum size of the PFT (D) is equal to D1 plus D2, wherein D1 is a pipeline delay from the first PSP to the second PSP, and wherein D2 is a backward bus delay from the second PSP to the first PSP.

23. The method according to claim 22, further comprising setting, by the second PSP, the dirty flag to indicate that there may be packets associated with the packet flow that read a stale state.

24. The method according to any of claims 21-23, further comprising setting, by the second PSP, the dirty flag to TRUE in response to the change to the state of the stateful function.

25. The method according to any of claims 21-24, further comprising: initializing, by the second PSP, the timer to the maximum size of the PFT (D); and decrementing, by the second PSP, the timer by 1 every clock cycle.

26. The method according to any of claims 15-25, further comprising removing, by the second PSP, the metadata prior to forwarding the packet.

27. The method according to any of claims 15-26, further comprising adding, by the second PSP, the packet flow data to the PFT when the PFT does not include the packet flow data corresponding to the packet flow of the packet.

28. The method according to any of claims 15-27, further comprising deleting, by the second PSP, the packet flow data from the PFT when the timer reaches zero.

29. A computer program product comprising computer-executable instructions stored on a non-transitory computer-readable storage medium, the computer-executable instructions, when executed by a processor of an apparatus, cause the apparatus to perform a method according to any of claims 15-28.

Description:
An Architecture and Algorithm for a Programmable Pipeline to Support Stateful Packet Processing

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] None.

TECHNICAL FIELD

[0002] The present disclosure is generally related to the field of data communication networks and, in particular, to an architecture and algorithm for a programmable pipeline to support stateful packet processing.

BACKGROUND

[0003] Programmable network devices are used to support in-network computing, network automation, and various network innovations. Such programmable network devices may include, for example, an application-specific integrated circuit (ASIC), a network processor (NP), a field programmable gate array (FPGA), and so on.

[0004] The programmable network devices can include a Protocol-Independent Switch Architecture (PISA) and can be programmed using the Programming Protocol-independent Packet Processors (P4) language. The main architecture is called a Match-Action Table (MAT) pipeline and features a forward path. The MAT pipeline includes a series of match-action tables. The MAT pipeline compares the attributes of incoming network packets to the data fields of the match-action tables and applies predefined actions based on the match results, such as forwarding the packet to a particular port, modifying packet headers of the packet, or dropping the packet.
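
As a rough illustration of the match-action model described above, the following minimal Python sketch matches a packet attribute tuple against table entries and applies the associated action. The field names and the default action are assumptions for illustration; they are not part of the original disclosure.

    # Minimal match-action table sketch (field names and the default action
    # are assumptions for illustration).
    ACTIONS = {
        "forward": lambda pkt, port: {**pkt, "egress_port": port},
        "drop": lambda pkt, arg: None,
    }

    # Each entry maps match fields to an (action name, action argument) pair.
    mat = {
        ("10.0.0.1", 80): ("forward", 1),
        ("10.0.0.2", 22): ("drop", None),
    }

    def apply_mat(pkt):
        # Compare packet attributes against the table and apply the matching
        # predefined action; unmatched packets take a default action.
        key = (pkt["dst_ip"], pkt["dst_port"])
        name, arg = mat.get(key, ("forward", 0))
        return ACTIONS[name](pkt, arg)

    print(apply_mat({"dst_ip": "10.0.0.1", "dst_port": 80}))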

SUMMARY

[0005] The present disclosure provides various embodiments of an architecture and algorithm/method for a programmable chip that includes a programmable pipeline (or simply pipeline) that can support stateful packet processing across multiple stages of the pipeline. A programmable pipeline is a series of interconnected processors. Each processor in the programmable pipeline is programmed or configured to perform a stage or part of a process (e.g., each processor in the programmable pipeline may be configured to perform a stage or part of a process for processing a packet for routing in a network device).

[0006] A first aspect relates to a pipeline architecture comprising a first pipeline stage processor (PSP) of a programmable pipeline, a second PSP of the programmable pipeline, and a backward bus coupled to the first PSP and the second PSP. The backward bus is configured to carry state update data from the second PSP to the first PSP, and the programmable pipeline is configured to process packets. The first PSP is configured to store a flow state table; receive a packet; read, from the flow state table, a state of a stateful function corresponding to a packet flow of the packet; and process the packet based on the state of the stateful function. The second PSP is configured to receive the packet subsequent to processing by the first PSP; determine that there is a change to the state of the stateful function; write back state update data to the first PSP in response to the change; and forward the packet.

[0007] Optionally, in a first implementation according to the first aspect, the first PSP is further configured to receive the state update data from the second PSP; and update the state of the stateful function in the flow state table from a first state to a second state based on the state update data.

[0008] Optionally, in a second implementation according to the first aspect or any implementation thereof, the first PSP is further configured to add metadata to the packet, the metadata comprising a flow identifier of the packet flow, state data, and a packet sequence number of the packet in the packet flow.

[0009] Optionally, in a third implementation according to the first aspect or any implementation thereof, the second PSP is further configured to identify subsequent packets associated with the packet flow that were processed by the first PSP using the first state; and resubmit, using the backward bus, the subsequent packets to the first PSP for reprocessing by the first PSP using the second state.

[0010] Optionally, in a fourth implementation according to the first aspect or any implementation thereof, the first PSP is further configured to process the subsequent packets ahead of new packets.

[0011] Optionally, in a fifth implementation according to the first aspect or any implementation thereof, the second PSP is further configured to maintain a pending flow table (PFT) and use the PFT to identify the subsequent packets.

[0012] Optionally, in a sixth implementation according to the first aspect or any implementation thereof, the PFT comprises a packet flow data for each packet flow processed by the second PSP, the packet flow data comprising a flow identifier, a sequence number, a timer, and a dirty flag.

[0013] Optionally, in a seventh implementation according to the first aspect or any implementation thereof, a maximum size of the PFT (D) is equal to D1 plus D2, wherein D1 is a pipeline delay from the first PSP to the second PSP, and wherein D2 is a backward bus delay from the second PSP to the first PSP.

[0014] Optionally, in an eighth implementation according to the first aspect or any implementation thereof, the second PSP is further configured to set the dirty flag to indicate that there may be packets associated with the packet flow that read a stale state.

[0015] Optionally, in a ninth implementation according to the first aspect or any implementation thereof, the second PSP is further configured to set the dirty flag to TRUE in response to the change to the state of the stateful function.

[0016] Optionally, in a tenth implementation according to the first aspect or any implementation thereof, the second PSP is further configured to initialize the timer to the maximum size of the PFT (D) and decrement the timer by 1 every clock cycle.

[0017] Optionally, in an eleventh implementation according to the first aspect or any implementation thereof, the second PSP is further configured to remove the metadata prior to forwarding the packet.

[0018] Optionally, in a twelfth implementation according to the first aspect or any implementation thereof, the second PSP is further configured to add the packet flow data to the PFT when the PFT does not include the packet flow data corresponding to the packet flow of the packet.

[0019] Optionally, in a thirteenth implementation according to the first aspect or any implementation thereof, the second PSP is further configured to delete the packet flow data from the PFT when the timer reaches zero.

[0020] A second aspect relates to a method of processing a packet. The method includes storing, by a first PSP of a programmable pipeline, a flow state table; receiving, by the first PSP, a packet; reading, by the first PSP from the flow state table, a state of a stateful function corresponding to a packet flow of the packet; processing, by the first PSP, the packet based on the state of the stateful function; receiving, by a second PSP of the programmable pipeline and subsequent to processing by the first PSP, the packet; determining, by the second PSP, that there is a change to the state of the stateful function; writing back, by the second PSP, using a backward bus that communicatively couples the second PSP and the first PSP, state update data to the first PSP in response to the change; and forwarding, by the second PSP, the packet.

[0021] Optionally, in a first implementation according to the second aspect, the method further includes receiving, by the first PSP, the state update data from the second PSP over the backward bus; and updating, by the first PSP, the state of the stateful function in the flow state table from a first state to a second state based on the state update data.

[0022] Optionally, in a second implementation according to the second aspect or any implementation thereof, the method further includes adding, by the first PSP, metadata to the packet, the metadata comprising a flow identifier of the packet flow, state data, and a packet sequence number of the packet in the packet flow.

[0023] Optionally, in a third implementation according to the second aspect or any implementation thereof, the method further includes identifying, by the second PSP, subsequent packets associated with the packet flow that were processed by the first PSP using the first state; resubmitting, by the second PSP and using the backward bus, the subsequent packets to the first PSP; and reprocessing, by the first PSP, the subsequent packets using the second state.

[0024] Optionally, in a fourth implementation according to the second aspect or any implementation thereof, the method further includes reprocessing, by the first PSP, the subsequent packets using the second state prior to processing new packets.

[0025] Optionally, in a fifth implementation according to the second aspect or any implementation thereof, the method further includes maintaining, by the second PSP, a pending flow table (PFT); and identifying, by the second PSP, the subsequent packets using the PFT.

[0026] Optionally, in a sixth implementation according to the second aspect or any implementation thereof, the PFT comprises a packet flow data for each packet flow processed by the second PSP, the packet flow data comprising a flow identifier, a sequence number, a timer, and a dirty flag.

[0027] Optionally, in a seventh implementation according to the second aspect or any implementation thereof, a maximum size of the PFT (D) is equal to D1 plus D2, wherein D1 is a pipeline delay from the first PSP to the second PSP, and wherein D2 is a backward bus delay from the second PSP to the first PSP.

[0028] Optionally, in an eighth implementation according to the second aspect or any implementation thereof, the method further includes setting, by the second PSP, the dirty flag to indicate that there may be packets associated with the packet flow that read a stale state.

[0029] Optionally, in a ninth implementation according to the second aspect or any implementation thereof, the method further includes setting the dirty flag to TRUE in response to the change to the state of the stateful function.

[0030] Optionally, in a tenth implementation according to the second aspect or any implementation thereof, the method further includes initializing the timer to the maximum size of the PFT (D) and decrementing the timer by 1 every clock cycle.

[0031] Optionally, in an eleventh implementation according to the second aspect or any implementation thereof, the method further includes removing the metadata prior to forwarding the packet.

[0032] Optionally, in a twelfth implementation according to the second aspect or any implementation thereof, the method further includes adding the packet flow data to the PFT when the PFT does not include the packet flow data corresponding to the packet flow of the packet.

[0033] Optionally, in a thirteenth implementation according to the second aspect or any implementation thereof, the method further includes deleting the packet flow data from the PFT when the timer reaches zero.

[0034] A third aspect relates to a programmable network device comprising the pipeline architecture according to the first aspect or any implementation thereof.

[0035] A fourth aspect relates to an apparatus comprising a memory configured to store instructions; and one or more processors coupled to the memory and configured to execute the instructions to cause the apparatus to perform the method according to the second aspect or any implementation thereof.

[0036] A fifth aspect relates to a computer program product comprising computer-executable instructions stored on a non-transitory computer-readable storage medium, the computer-executable instructions, when executed by a processor of an apparatus, cause the apparatus to perform a method according to the second aspect or any implementation thereof.

[0037] For the purpose of clarity, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.

[0038] These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0039] For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

[0040] FIG. 1 is an example of a Protocol-Independent Switch Architecture (PISA).

[0041] FIG. 2 is a schematic diagram of an architecture of a programmable pipeline that may be used to execute a cross-stage stateful packet processing function (SPPF) according to an embodiment of the disclosure.

[0042] FIG. 3 is a diagram of a cross-stage SPPF according to an embodiment of the disclosure.

[0043] FIG. 4 is a schematic diagram illustrating processing details performed by a stage-read pipeline stage processor (PSP) of a cross-stage SPPF according to an embodiment of the disclosure.

[0044] FIG. 5 is a schematic diagram illustrating processing details performed by a stage-write PSP of a cross-stage SPPF according to an embodiment of the disclosure.

[0045] FIG. 6 is a schematic diagram illustrating state transitions of a packet flow according to an embodiment of the disclosure.

[0046] FIG. 7 is a flowchart illustrating a packet processing method according to an embodiment of the disclosure.

[0047] FIG. 8 is a flowchart illustrating a packet processing method according to an embodiment of the disclosure.

[0048] FIG. 9 is a flowchart illustrating a packet processing method according to an embodiment of the disclosure.

[0049] FIG. 10 is a schematic diagram of a network device according to an embodiment of the disclosure.

DETAILED DESCRIPTION

[0050] It should be understood at the outset that although illustrative implementations of one or more embodiments are provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

[0051] The present disclosure provides various embodiments of an architecture and algorithm/method for a programmable chip that includes a programmable pipeline (or simply pipeline) that can support stateful packet processing across multiple stages of the pipeline. A programmable pipeline is a plurality of processors arranged in series (i.e., connected in a linear or sequential fashion), and each processor can be programmed or configured to perform a particular stage of a multi-stage packet processing algorithm on packets passing through a network device having the programmable chip (e.g., a network switch). Each stage can be customized to perform specific actions (e.g., packet classification, filtering, routing, etc.) based on attributes of a packet or other data. The pipeline allows for the concurrent execution of multiple instructions, thus enhancing the overall performance of the network device. In an example embodiment, a MAT-based programmable pipeline is augmented to include a backward bus that enables data from a later stage of the pipeline to be passed back to an earlier stage of the pipeline. The augmented MAT-based programmable pipeline design further includes corresponding algorithms to support stateful functions that cross multiple pipeline stages.

[0052] FIG. 1 is an example of a Protocol-Independent Switch Architecture (PISA) 100. The PISA 100 may be implemented in network devices, such as network switches, that are not tied to a specific networking protocol. The PISA 100 enables the customization of packet processing within a switch. Network operators can define/configure (i.e., program) how packets should be processed based on specific criteria, which can be useful for various networking tasks, such as traffic filtering, load balancing, or implementing custom networking features.

[0053] As shown, the PISA 100 includes a traffic manager 102 disposed between an ingress pipeline 104 and an egress pipeline 106. The ingress pipeline 104 and the egress pipeline 106 are configured to handle different aspects of packet processing as packets enter and exit the network device/switch. The ingress pipeline 104 and the egress pipeline 106 of FIG. 1 each include a front-end parser 108, a plurality of pipeline stage processors (PSPs) 110, and a queue/buffer 112. The front-end parser 108 is configured to receive, parse, and craft packets (a.k.a., data packets, network packets, etc.). For example, in some embodiments, the front-end parser 108 may be programmed or configured to parse a packet into individual headers for processing by the plurality of PSPs 110. The front-end parser 108 is configured to forward the packets to the plurality of PSPs 110.

[0054] In an embodiment, the plurality of PSPs 110 comprises multiple programmable PSPs. For simplicity, two PSPs are shown in the plurality of PSPs 110. However, the plurality of PSPs 110 may contain any number of PSPs. In order to process the packets, each PSP in the plurality of PSPs 110 includes, for example, a memory 114 and an arithmetic logic unit (ALU) 116. The plurality of PSPs 110, or one or more individual PSPs in the plurality of PSPs 110, may include other components in practical applications.

[0055] In some embodiments, the plurality of PSPs 110 may be programmed with certain data (e.g., a flow state data table or other data tables) and a processing algorithm for processing packets through the plurality of PSPs 110 based on the data. Each PSP 110 in the plurality of PSPs 110 may be configured to perform a part (commonly referred to as a stage) of the packet processing. For example, as shown in FIG. 1, a packet may be processed using a two-stage packet processing by the plurality of PSPs 110. For instance, the PSP 110 immediately adjacent to the front-end parser 108 may be configured to receive the packet from the front-end parser 108 and perform a first stage of the two-stage packet processing. In some embodiments, a packet may simply be forwarded without processing in one or more stages of the plurality of PSPs.

[0056] After performing packet processing at the first stage, if any, the first PSP 110 forwards the packet to the next PSP 110 for the second stage of the packet processing. The PSP 110 in the second stage may or may not process the packets received from the PSP 110 configured to perform the first stage. This process continues in succession until all of the PSPs 110 in subsequent stages have either processed the packets or forwarded the packets to the next PSP 110 without performing any processing. The last PSP 110 in sequential order forwards the packets to the queue/buffer 112.
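
The forward-only stage chain described above can be pictured with a short Python sketch, where each stage either processes the packet or passes it through unchanged. The stage functions and packet fields are invented for the example and are not part of the disclosure.

    # Forward-only pipeline sketch: packets traverse the stages in order, and
    # each stage either processes the packet or passes it through unchanged.
    def stage_one(pkt):
        pkt["ttl"] = pkt.get("ttl", 64) - 1  # example first-stage processing
        return pkt

    def stage_two(pkt):
        return pkt  # this stage forwards the packet without processing

    def run_pipeline(pkt, stages):
        for stage in stages:  # data moves in one direction only
            pkt = stage(pkt)
        return pkt  # the last stage hands the packet to the queue/buffer

    print(run_pipeline({"dst_ip": "10.0.0.1"}, [stage_one, stage_two]))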

[0057] In some embodiments, deparsing may occur at or prior to the queue/buffer 112 stage to ensure that the outgoing packet is formatted correctly prior to forwarding. For example, the packet may be reconstructed or one or more packet headers may be added, modified, or removed during deparsing. The queue/buffer 112 temporarily stores the packets until the packets are transmitted.

[0058] The traffic manager 102 receives the packets from the queue/buffer 112 of the ingress pipeline 104. In some embodiments, the queue/buffer 112 is incorporated into, and forms part of, the traffic manager 102. In some embodiments, the traffic manager 102 is configured to queue, replicate, and schedule the transmission of the packets. The traffic manager 102 forwards the packets to the front-end parser 108 of the egress pipeline 106, which in turn forwards the packet or packet data after parsing to the plurality of PSPs 110 of the egress pipeline 106. The plurality of PSPs 110 of the egress pipeline 106 may be configured to perform additional processing on the packet before transmitting the packet out of the network device.

[0059] One drawback to the architecture of the PISA 100 of FIG. 1 or other conventional MAT-based programmable pipelines is that the pipelines provide a forward processing path (i.e., the packet goes from a first stage of packet processing to a second stage, then to a third stage, and so on) and do not support cross-stage stateful packet processing functions (SPPFs). A stateful packet processing function (SPPF) is a program or set of instructions that when executed is configured to determine a state associated with a packet (or a packet flow of the packet) and process the packet based on the determined state. A state can be any factor or variable that is used for processing the packet. For example, a state may be a counter and the packet processing may be based on whether the counter exceeds a certain value. A state may also be a Boolean variable to indicate whether some condition is true or false. Since the current MAT-based programmable pipelines provide a forward processing path, any stateful packet processing function has to be completed within a single stage/PSP because the current pipeline architecture does not support a backward path that enables data from a later stage/PSP in the pipeline to be carried back to an earlier stage/PSP of the pipeline. While many simple SPPFs can be completed in a single stage/PSP of the pipeline, there are some complex SPPFs that require multiple processing stages of a pipeline (i.e., cross-stage processing) before completion. In other words, a cross-stage SPPF is a function that involves multiple stages to determine a state update.
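
For a concrete sense of the single-stage SPPF described above, the following sketch (a minimal illustration, assuming a per-flow counter as the state and a fixed limit; not the disclosed design) reads a state, processes the packet based on it, and updates the state within the same stage.

    # Single-stage SPPF sketch: the state is a per-flow packet counter, and
    # the processing decision depends on whether the counter exceeds a limit.
    flow_state = {}  # flow_id -> counter (the state read and written here)

    def rate_limit(pkt, limit=3):
        count = flow_state.get(pkt["flow_id"], 0)  # read the flow's state
        if count >= limit:
            return None  # drop once the counter reaches the limit
        flow_state[pkt["flow_id"]] = count + 1  # update completes in-stage
        return pkt

    for i in range(5):
        print(rate_limit({"flow_id": "f1", "seq": i}))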

[0060] To handle these cross-stage SPPFs in a conventional MAT-based programmable pipeline, a network device may be programmed to delay the sending of a next packet into a pipeline until an earlier packet in the pipeline has completed the cross-stage SPPF. This ensures that the next packet is processed using the correct state in the event that the processing of the earlier packet in the pipeline causes a change to the state. However, this is an extremely inefficient use of the pipeline and increases latency (i.e., delay or the time it takes for data to travel) in the network. For example, assume the cross-stage SPPF takes 10 stages of a pipeline to complete and that each processing stage takes one clock cycle. In this example, after an initial packet enters the pipeline, a 10-clock-cycle delay would occur for each subsequent packet. This delay or holding back of packets from entering the pipeline can severely limit the processing efficiency of the pipeline and increase network latency.

[0061] To address this problem, embodiments of the present disclosure include an architecture and algorithm/method for a programmable pipeline that includes a backward bus that enables data from a later stage of the pipeline to be passed back to an earlier stage of the pipeline. Using the disclosed embodiments, packets that are processed using a cross-stage SPPF do not need to be held back from entering the pipeline until an earlier packet completes processing of the cross-stage SPPF. Instead, the disclosed programmable pipeline is configured to execute corresponding algorithms (or instructions) using one or more processors to support stateful functions that cross multiple pipeline stages (i.e., cross-stage stateful functions) so that data indicating a change to a state affecting the processing of packets, and any packets in the pipeline that were processed using an outdated/incorrect state, are carried back on the backward bus to an earlier processing stage/PSP of the pipeline where the state is read and used for processing the packet. Thus, the disclosed embodiments support and improve pipeline performance efficiency of cross-stage SPPFs.

[0062] FIG. 2 is a schematic diagram of an architecture of a programmable pipeline 200 that may be used to execute a cross-stage SPPF according to an embodiment of the disclosure. The programmable pipeline 200 includes a plurality of PSPs including a PSP 202 (labeled PSP (SR) for stage-read) and a PSP 204 (labeled PSP (SW) for stage-write). The cross-stage SPPF is mapped to multiple pipeline stages of the programmable pipeline 200 starting from PSP 202 to the PSP 204.

[0063] A backward bus 206 is coupled to the PSP 202 and the PSP 204. In some embodiments, the backward bus 206 may also be coupled to other PSPs (not shown) in the programmable pipeline 200. The backward bus 206 enables data from the PSP 204 to be carried back to the PSP 202. The programmable pipeline 200 may include other SR PSPs and SW PSPs that correspond to the stage-read and the stage-write, respectively, of other cross-stage SPPFs implemented by the programmable pipeline 200.

[0064] In the depicted embodiment, the PSP 202 is the PSP in the programmable pipeline 200 that is configured to read a state (e.g., from a state data table stored at PSP 202) and use the state in a cross-stage SPPF for processing a packet through the programmable pipeline 200. The PSP 202 may or may not be the first PSP of the programmable pipeline 200. The PSP 202 is coupled to the PSP 204 by a forward pipeline path 208. The forward pipeline path 208 may include additional PSPs (not shown) between the PSP 202 and the PSP 204 that are configured to perform one or more stages of processing on the packet. Data is forwarded in one direction from the PSP 202 towards the PSP 204 on the forward pipeline path 208 as shown in FIG. 2. The PSP 204 is the PSP in the programmable pipeline 200 that is configured to determine whether the state read at the PSP 202 has changed based on the processing of the packet using the cross-stage SPPF. The PSP 204 may or may not be the last PSP of the programmable pipeline 200. As further described below, the PSP 204 is configured to perform one or more actions when there is a state change.

[0065] FIG. 3 is a diagram of a cross-stage SPPF 300 according to an embodiment of the disclosure. In an embodiment, the cross-stage SPPF 300 is implemented by the programmable pipeline 200 of FIG. 2. For instance, the PSP 202 of the programmable pipeline 200 may be configured, as a stage-read PSP mapped to the cross-stage SPPF 300, to receive a packet and read (i.e., perform a lookup on) a flow state table (FST) 302 to determine current state data associated with the packet or a packet flow corresponding to the packet for processing the packet through the pipeline. In an embodiment, the FST 302 stores data on a current state of packet flows that are processed through the pipeline. The FST 302 may also store a sequence number (seq_num) indicating an order of the packet in the packet flow. In an embodiment, the data in the FST 302 is indexed by flow identifiers (IDs). In an embodiment, the PSP 202 is configured to extract data from the packet (e.g., source/destination Internet Protocol (IP) addresses, source/destination port numbers, and transport protocol) to determine a flow ID of the packet for identifying the packet flow data in the FST 302 corresponding to the packet flow of the packet. In some embodiments, the PSP 202 may be configured to perform some initial processing on the packet based on the data stored in the FST 302 corresponding to the packet. For example, in some embodiments, after reading the FST 302, the PSP 202 is configured to add metadata to the packet. The metadata may include, but is not limited to, the flow ID, state data, and sequence number of the packet. The packet containing the metadata is then forwarded along the forward pipeline path 208 for further processing. Additional processing details performed by a stage-read PSP of a cross-stage SPPF are described below in reference to FIG. 4.
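
The flow-ID derivation and FST lookup described above might look like the following sketch. The hash construction, table layout, and field names are assumptions made for illustration, not the disclosed design.

    import hashlib

    # Flow-ID derivation from the 5-tuple and an FST lookup with metadata
    # attachment; the hash construction and table layout are assumptions.
    def flow_id(pkt):
        five_tuple = (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
                      pkt["dst_port"], pkt["proto"])
        return hashlib.sha1(repr(five_tuple).encode()).hexdigest()[:8]

    fst = {}  # flow ID -> {"state": ..., "seq_num": ...}

    def stage_read(pkt):
        fid = flow_id(pkt)
        entry = fst.setdefault(fid, {"state": "INIT", "seq_num": 0})
        # Attach the metadata carried along the forward pipeline path.
        pkt["meta"] = {"flow_id": fid, "state": entry["state"],
                       "seq_num": entry["seq_num"]}
        entry["seq_num"] += 1  # next packet of the flow gets the next number
        return pkt

    print(stage_read({"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
                      "src_port": 1234, "dst_port": 80, "proto": 6}))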

[0066] The metadata may be used in one or more stages of the cross-stage SPPF 300 for processing the packet. For example, in the depicted embodiment, the metadata may be used to perform a state calculation 304. The state calculation 304 may be implemented by an algorithm that is configured to identify a change to a state (i.e., state data) that is used by the cross-stage SPPF 300 for processing packets of the packet flow corresponding to the packet. In an embodiment, the state calculation 304 process may span across multiple PSPs of the programmable pipeline 200. For example, each PSP may be configured to perform a portion or stage of the state calculation 304.

[0067] Once the state calculation 304 is completed, a stage-write PSP mapped to the cross-stage SPPF 300 (e.g., the PSP 204 of the programmable pipeline 200) is configured to perform, based on the results of the state calculation 304, one or more actions and, if necessary, perform a state update 306. For example, in an embodiment, if the PSP 204 determines, based on the results of the state calculation 304, that there is a change to a state used by the cross-stage SPPF 300 for processing packets of the packet flow corresponding to the packet, the PSP 204 is configured to send or writeback 308, e.g., using the backward bus 206, state update data that indicates the change to the state. The PSP 202 (or a stage-read PSP mapped to the cross-stage SPPF 300) receives the state update data and updates, based on the state update data, the state data in the FST 302 corresponding to the packet flow of the packet for subsequent packet processing.
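
A minimal sketch of the writeback exchange described above follows; the SWB record layout and the helper names are assumptions for illustration.

    # Writeback sketch: the stage-write PSP emits state update data (SWB) and
    # the stage-read PSP applies it to its FST; record layout is an assumption.
    fst = {"f1": {"state": "FIRST", "seq_num": 3}}

    def emit_swb(fid, new_state):
        return {"flow_id": fid, "state": new_state}  # carried on backward bus

    def apply_swb(swb):
        # Update the flow's state from the first state to the second state.
        fst[swb["flow_id"]]["state"] = swb["state"]

    apply_swb(emit_swb("f1", "SECOND"))
    print(fst["f1"])  # {'state': 'SECOND', 'seq_num': 3}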

[0068] Additionally, in some embodiments, the PSP 204 is configured to identify packets, if any, currently in the programmable pipeline 200 (i.e., packets that entered the pipeline after the packet that caused the state to change and were processed using the stale/incorrect state data). In an embodiment, the PSP 204 is configured to send, using the backward bus 206, any identified packets back to the PSP 202 for reprocessing using the new state. Additional processing details performed by a stage-write PSP of a cross-stage SPPF are described below in reference to FIG. 5.

[0069] FIG. 4 is a schematic diagram illustrating processing details performed by a stage-read PSP of a cross-stage SPPF according to an embodiment of the disclosure. In an embodiment, the stage-read PSP is the first PSP of a plurality of PSPs in a pipeline that is mapped to the cross-stage SPPF. For example, if a pipeline includes PSP1 to PSP20, and the cross-stage SPPF is mapped to PSP4 to PSP10 of the pipeline, then the stage-read PSP would be PSP4. As described above, the stage-read PSP is configured to read a state or state data from a data table stored by the stage-read PSP to obtain state data for processing the packet using the cross-stage SPPF. For example, as shown in FIG. 4, the stage-read PSP is configured to receive one or more pipeline packets (PPs) 402 for processing. In some embodiments, the pipeline packet 402 may be placed in a processing queue or buffer when the stage-read PSP is busy processing other packets. In the depicted embodiment, assuming that the stage-read PSP is available to process a pipeline packet 402, the stage-read PSP reads state data from a FST 404 to obtain state data of a packet flow corresponding to the packet. In an embodiment, the FST 404 stores data corresponding to packet flows based on (i.e., indexed by) a flow ID. For example, using a flow ID generated from data extracted from a packet, the stage-read PSP can obtain, from the FST 404, state data and a sequence number (seq_num) associated with the flow ID. The sequence number indicates an order of the packet in the packet flow. The stage-read PSP increments the sequence number by 1 in the FST 404 for each packet of the packet flow that is processed by the stage-read PSP. As shown in FIG. 4, the data obtained from the FST 404 (flow ID, state data, seq_num) is added as metadata to the packet 406. The packet 406 is then forwarded to the next PSP stage mapped to the cross-stage SPPF for further processing.

[0070] Additionally, FIG. 4 illustrates the stage-read PSP receiving a state writeback (SWB) 408 (i.e., state update data that is written back, using a backward bus, from a stage-write PSP of a cross-stage SPPF to the stage-read PSP). The stage-read PSP updates the state data corresponding to the flow ID of the packet flow of the packet in the FST 404 based on the SWB 408. Further, FIG. 4 illustrates the stage-read PSP receiving a resubmitted packet (RP) 410, on the backward bus, from the stage-write PSP. The RP 410 is a packet that was processed through the cross-stage SPPF after the packet that caused the state data of the packet flow to change and thus, was processed using stale state data because the state data in the FST 404 had not been updated to reflect the state change. Although the SWB 408 and the RP 410 are both transmitted on the same backward bus, the SWB 408 never conflicts with the RP 410.

[0071] In an embodiment, the stage-read PSP is configured to assign a higher processing priority to the RP 410 than the pipeline packet (PP) 402. The stage-read PSP buffers any pipeline packets (PPs) 402 and processes the RP 410 immediately, or as soon as possible, ahead of the PP 402. For the RP 410, the stage-read PSP reads the FST 404 to obtain any updated state data corresponding to the packet flow of the RP 410 and updates the state data metadata in the RP 410 with the updated state data. The stage-read PSP maintains/keeps the existing sequence number already in the metadata of the RP 410 to maintain the correct sequence of packets corresponding to the packet flow. The RP 410 is then forwarded to the next PSP stage mapped to the cross-stage SPPF for further reprocessing.
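
The arbitration and reprocessing behavior described above can be sketched as follows. The queue structures and names are assumptions; an actual PSP would realize this in hardware.

    from collections import deque

    # Input arbitration sketch at the stage-read PSP: resubmitted packets
    # (RPs) are served ahead of buffered pipeline packets (PPs).
    pp_queue, rp_queue = deque(), deque()

    def next_packet():
        if rp_queue:  # RPs have the higher processing priority
            return rp_queue.popleft()
        return pp_queue.popleft() if pp_queue else None

    def refresh_rp(rp, fst):
        meta = rp["meta"]
        meta["state"] = fst[meta["flow_id"]]["state"]  # re-read updated state
        # meta["seq_num"] is kept to preserve the flow's packet ordering
        return rp

    pp_queue.append({"meta": {"flow_id": "f2", "seq_num": 0, "state": "X"}})
    rp_queue.append({"meta": {"flow_id": "f1", "seq_num": 2, "state": "OLD"}})
    print(next_packet()["meta"]["flow_id"])  # "f1": the RP goes first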

[0072] FIG. 5 is a schematic diagram illustrating processing details performed by a stage-write PSP of a cross-stage SPPF according to an embodiment of the disclosure. In an embodiment, the stage-write PSP is the last PSP of a plurality of PSPs in a pipeline that is mapped to the cross-stage SPPF. For example, if a pipeline includes PSP1 to PSP20, and the cross-stage SPPF is mapped to PSP4 to PSP10 of the pipeline, then the stage-write PSP would be PSP10.

[0073] In the depicted embodiment, the stage-write PSP includes a pending flow table (PFT) 502, a decision engine 504, and an action 506 component. The PFT 502 stores data on packet flows corresponding to packets that are resubmitted for processing (e.g., RP 410). In an embodiment, the PFT 502 has one packet flow entry for all resubmitted packets that are from the same packet flow. In an embodiment, the maximum size of the PFT 502 is equal to the total latency (D) associated with the cross-stage SPPF. The total latency (D) is equal to D1 plus D2, where D1 is a pipeline delay from the stage-read PSP to the stage-write PSP of the cross-stage SPPF, and D2 is a backward bus delay from the stage-write PSP to the stage-read PSP of the cross-stage SPPF. The maximum size of the PFT 502 occurs when each stage/clock cycle of the cross-stage SPPF is currently processing a resubmitted packet from a different packet flow (i.e., every packet currently being processed in the cross-stage SPPF is associated with a different packet flow) because the PFT 502 would have to include a packet flow entry for each resubmitted packet currently in process. However, typically some of the resubmitted packets belong to a same packet flow.

[0074] In an embodiment, each packet flow entry in the PFT 502 includes a sequence number (seq_num), a dirty flag (dirty_flag), and a timer for each packet flow currently being processed by the cross-stage SPPF. In an embodiment, the packet flow entry may be indexed by the flow ID of the packet flow (e.g., flow ID: seq_num, dirty_flag, timer). The sequence number indicates an order of the packet in the packet flow. In an embodiment, the dirty flag is a Boolean variable that is used to indicate whether packets associated with a packet flow should be resubmitted for processing. In an embodiment, the timer variable may be used to determine when a packet flow entry can be removed/deleted from the PFT 502 as further described below.

[0075] In an embodiment, the decision engine 504 is configured to utilize and update the data in the PFT 502. For example, in an embodiment, the decision engine 504 is configured to determine whether to perform a SWB 408 based on whether a packet causes a state update. For instance, when a packet causes a state update, the decision engine 504 generates the SWB 408 and modifies the PFT 502 to include a flow entry of a packet flow corresponding to the packet that caused the state update. The decision engine 504 also utilizes the data in the PFT 502 to determine whether a packet is to be resubmitted and to monitor a resubmitted packet. For example, in an embodiment, the decision engine 504 resubmits a packet for reprocessing based on the sequence number and on the dirty flag being set to TRUE or 1. Additionally, in an embodiment, when a new packet flow entry is added to the PFT 502 or when a packet is resubmitted for processing, the timer for the packet flow entry can be initialized to a value equal to the total latency (D) associated with the cross-stage SPPF (e.g., init value = D). The timer is then decremented by 1 for each clock cycle. Thus, any resubmitted packet should complete reprocessing within D cycles (i.e., by the time the timer is 0). When the timer is 0 for a packet flow entry in the PFT 502, this means that there are no resubmitted packets corresponding to the packet flow that are still being processed, and that the packet flow entry in the PFT 502 can be deleted. Additional processing details performed by the decision engine 504 are described below in reference to FIG. 6 and FIG. 7.
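
A sketch of a PFT entry and its timer-driven expiry follows, under the assumption of example delays D1 = 6 and D2 = 4 clock cycles; all names and values are illustrative.

    from dataclasses import dataclass

    D1, D2 = 6, 4  # example pipeline and backward-bus delays, in clock cycles
    D = D1 + D2    # total latency: bound on PFT size and reprocessing window

    @dataclass
    class PftEntry:
        seq_num: int        # next expected packet of the flow
        dirty: bool = True  # packets may have read a stale state
        timer: int = D      # decremented by 1 each clock cycle

    pft = {}  # flow ID -> PftEntry

    def tick():
        # An entry whose timer reaches 0 has no in-flight resubmissions left.
        for fid in list(pft):
            pft[fid].timer -= 1
            if pft[fid].timer <= 0:
                del pft[fid]

    pft["f1"] = PftEntry(seq_num=5)
    for _ in range(D):
        tick()
    print("f1" in pft)  # False: the entry aged out after D cycles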

[0076] The action 506 component is configured to perform any action needed to complete the processing or forwarding of a packet at the stage-write PSP when the decision engine 504 determines that the packet does not need to be resubmitted. For example, in some embodiments, the action 506 component may be configured to remove metadata that was previously added to the packet at the stage-read PSP, add/modify/delete one or more headers, or format the packet for forwarding. In some embodiments, the decision engine 504 and the action 506 component may be combined.

[0077] FIG. 6 is a schematic diagram illustrating state transitions of a packet flow according to an embodiment of the disclosure. In an embodiment, the decision engine 504 of FIG. 5 may determine a state of a packet flow corresponding to packets processed at the stage-write PSP of a cross-stage SPPF. In the depicted embodiment, a packet flow is in one of the following three states: an idle state, a dirty state, or a clean state. The idle state means that a packet flow corresponding to the packet being processed at the stage-write PSP is not in the PFT 502 (i.e., no packets in the packet flow caused a state update). Every packet flow is initially in the idle state. The dirty state means that a packet flow has a flow entry in the PFT 502 due to a packet in the packet flow causing a state update and that there may be packets of the packet flow that need to be resubmitted (i.e., there may be packets of the packet flow with stale state). The clean state means that a packet flow has a flow entry in the PFT 502 due to a packet in the packet flow causing a state update and that there are no packets of the packet flow that still need to be resubmitted (i.e., all packets of the packet flow with stale state have been resubmitted).

[0078] In reference to FIG. 6, Table 1 provides the conditions and actions for transitioning from a current state to a next state according to an embodiment of the disclosure. The next state transition number indicated in parenthesis (#) corresponds to the numbered transitions illustrated in FIG. 6.

TABLE 1

    Current state | Condition                                          | Actions                                                              | Next state
    Idle          | p(f) processed; no state update; no entry for f   | remove metadata; forward p(f)                                        | Idle (1)
    Idle          | p(f) processed; state update; no entry for f      | add entry for f; f.seq_num = p.seq_num + 1; f.timer = D;             | Dirty (2)
                  |                                                    | f.dirty = TRUE; generate SWB; remove metadata; forward p(f)          |
    Dirty         | new p(f); p(f).seq_num == f.seq_num               | f.dirty = FALSE; f.timer = D; resubmit p(f)                          | Clean (3)
    Dirty         | new p(f); p(f).seq_num != f.seq_num               | f.timer = D; resubmit p(f)                                           | Dirty (4)
    Dirty         | f timeout (f.timer == 0)                          | delete entry for f                                                   | Idle (5)
    Clean         | new p(f); p(f).seq_num == f.seq_num; no update    | f.seq_num += 1; remove metadata; forward p(f)                        | Clean (6)
    Clean         | new p(f); p(f).seq_num != f.seq_num               | f.timer = D; resubmit p(f)                                           | Clean (7)
    Clean         | new p(f); p(f).seq_num == f.seq_num; state update | f.seq_num += 1; f.dirty = TRUE; f.timer = D; generate SWB;           | Dirty (8)
                  |                                                    | remove metadata; forward p(f)                                        |
    Clean         | f timeout (f.timer == 0)                          | delete entry for f                                                   | Idle (9)

[0079] In Table 1, f represents a flow (or packet flow) and p(f) represents a packet of the flow f. In an embodiment, the decision engine 504 is configured to perform an action when a packet of a flow p(f) is processed at the stage-write PSP and a search of the PFT 502 is performed, or when there is a timer timeout for the flow f. For instance, as shown in Table 1, when a packet of a flow p(f) is processed and the packet p(f) does not cause a state update (i.e., the packet does not cause one or more state data used for processing packets of the packet flow to change) and there is no flow entry for the flow f in the PFT 502, then the decision engine 504 is configured to remove the metadata from the packet (i.e., the metadata that was added at the stage-read PSP) and forward the packet. As shown in Table 1 and in FIG. 6, flow f remains in the idle state (Idle (1) in FIG. 6).

[0080] However, when a packet of a flow p(f) is processed and the packet p(f) does cause a state update (i.e., the packet causes one or more state data used for processing packets of the packet flow to change) and there is no flow entry for the flow f in the PFT 502, then the decision engine 504 is configured to add a flow entry for f to the PFT 502, set the sequence number of the flow entry for f to the sequence number of the packet plus one (f.seq_num = p.seq_num + 1), set the timer of the flow entry for f to D (i.e., the maximum number of clock cycles needed to reprocess the packet), set the dirty flag of the flow entry for f to TRUE to indicate that there may be packets of the packet flow that need to be resubmitted, and generate the SWB 408 indicating the change to the one or more state data used for processing packets of the packet flow. Even though this packet caused the state update, the packet itself was processed using the correct state data, so the decision engine 504 is configured to remove the metadata from the packet and forward the packet. As shown in Table 1 and FIG. 6, flow f is now in the dirty state (Dirty (2) in FIG. 6).

[0081] As shown in Table 1, when another packet of the same flow is processed (i.e., a new p(f)), the decision engine 504 is configured to compare the sequence number of the new packet to the sequence number of the flow entry in the PFT 502. As stated above, the sequence number of the flow entry for f is set to the sequence number of the packet that caused the state update plus 1 (f.seq_num = p.seq_num + 1). This means that the sequence number of the flow entry for f in the PFT 502 is set to the sequence number of the next packet in the flow after the packet that caused the state update. When the new packet arrives and the flow f is in a dirty state (i.e., the dirty flag of the flow entry in the PFT 502 is set to TRUE) and the sequence number of the new packet is not equal to the sequence number of the flow entry for f in the PFT 502 (p(f).seq_num != f.seq_num), then the decision engine 504 is configured to resubmit the packet and reset the timer for the flow to D (f.timer = D). The state of the flow f remains in a dirty state (Dirty (4) in FIG. 6).

[0082] When the new packet arrives and the flow f is in a dirty state and the sequence number of the new packet is equal to the sequence number of the flow entry for f in the PFT 502 (p(f).seq_num == f.seq_num), then the decision engine 504 is configured to resubmit the packet, reset the timer for the flow to D (f.timer = D), and set the dirty flag of the flow entry in the PFT 502 to FALSE (f.dirty = FALSE) to put the flow state in the clean state, indicating that all packets of the packet flow with stale state have been resubmitted (Clean (3) in FIG. 6).

[0083] When the new packet arrives and the flow f is in a clean state (i.e., the dirty flag of the flow entry in the PFT 502 is set to FALSE) and the sequence number of the new packet is not equal to the sequence number of the flow entry for f in the PFT 502 (p(f).seq_num != f.seq_num), then the decision engine 504 is configured to resubmit the packet and reset the timer for the flow to D (f.timer = D). The state of the flow f remains in a clean state (Clean (7) in FIG. 6).

[0084] When the new packet arrives and the flow f is in a clean state, the sequence number of the new packet is equal to the sequence number of the flow entry for f in the PFT 502 (p(f).seq_num == f.seq_num), and the packet does not cause a state update, then the decision engine 504 is configured to increase the sequence number of the flow entry for f in the PFT 502 by one (f.seq_num += 1), remove the metadata from the packet, and forward the packet. The flow f remains in a clean state (Clean (6) in FIG. 6).

[0085] However, when the new packet arrives and the flow f is in a clean state, and the sequence number of the new packet is equal to the sequence number of the flow entry for f in the PFT 502 (p(f).seq_num == f.seq_num), but the packet causes a state update, then the decision engine 504 is configured to increase the sequence number of the flow entry for f in the PFT 502 by one (f.seq_num += 1), set the dirty flag of the flow entry for f to TRUE to indicate that there may be packets of the packet flow that need to be resubmitted, set the timer of the flow entry for f to D, and generate the SWB 408 indicating the change to the one or more state data used for processing packets of the packet flow. Even though this packet caused another state update, the packet itself was processed using the correct state data, so the decision engine 504 is configured to remove the metadata from the packet and forward the packet. The state of the flow f is back in a dirty state (Dirty (8) in FIG. 6).

[0086] Additionally, as shown in Table 1 and in FIG. 6, when a timer for a flow entry for f in the PFT 502 reaches 0 (i.e., f timeout), which indicates that all resubmitted packets for f have completed processing, the decision engine 504 is configured to remove/delete the flow entry for f in the PFT 502. This results in the flow f being placed back in an idle state (Idle (5) and Idle (9) in FIG. 6).

[0087] As shown above, in some cases, a resubmitted packet may cause another state update (e.g., Dirty (8) in FIG. 6), which may cause any resubmitted packets or new packets that entered the pipeline for processing following the resubmitted packet that caused the state update to have to be resubmitted again. This may cause an inefficient processing loop to occur. Thus, in some embodiments, when there are resubmitted packets of a flow in the pipeline, new packets from the same flow or, in some cases, even resubmitted packets from the same flow, are held/queued from entering the pipeline until a resubmitted packet of the flow has cleared the pipeline (e.g., wait D cycles) so that those packets do not cause an inefficient processing loop. In an embodiment, the stage-read PSP may implement or store a blocking table (BT) with the flows to block from entering/re-entering the pipeline until the resubmitted packet or packets of a flow have all cleared. In these embodiments, state transitions 4 and 7 in FIG. 6 would be eliminated because the resubmitted packet would always be the next packet of the flow to be processed (i.e., p(f).seq_num == f.seq_num would always be true).
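
A sketch of the optional blocking table described above follows; the structure and names are assumptions for illustration.

    # Blocking-table sketch: flows with in-flight resubmitted packets are held
    # at the pipeline entrance for D cycles before new packets may enter.
    D = 10         # example total latency in clock cycles
    blocking = {}  # flow ID -> cycles remaining until the flow may re-enter

    def admit(flow_id):
        return blocking.get(flow_id, 0) == 0  # otherwise queue the packet

    def block(flow_id):
        blocking[flow_id] = D  # hold the flow until resubmissions clear

    def bt_tick():
        for fid in list(blocking):
            blocking[fid] -= 1
            if blocking[fid] <= 0:
                del blocking[fid]

    block("f1")
    print(admit("f1"), admit("f2"))  # False True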

[0088] FIG. 7 is a flowchart illustrating a packet processing method 700 according to an embodiment of the disclosure. The packet processing method 700 is an example of an implementation corresponding to the conditions/actions specified in Table 1. In an embodiment, the packet processing method 700 may be implemented by a decision engine such as the decision engine 504 of FIG. 5.

[0089] The method 700 starts, at step 702, by receiving a packet of a flow. The decision engine determines, at step 704, whether there is a flow entry for the flow in the PFT (e.g., PFT 502). When there is no flow entry for the flow in the PFT, the decision engine determines, at step 706, whether the packet causes a state update. If the packet causes a state update, the decision engine, at step 708, adds a flow entry for the flow in the PFT. At step 710, the decision engine sets the sequence number of the flow entry to the sequence number of the packet plus 1, sets the timer of the flow entry to D (i.e., the maximum number of clock cycles needed to reprocess the packet), sets the dirty flag of the flow entry to TRUE to indicate that there may be packets of the packet flow that need to be resubmitted, and generates a state update writeback indicating the change to the one or more state data used for processing packets of the packet flow. At step 712, the decision engine removes the metadata from the packet and forwards the packet. If, at step 706, the decision engine determines that the packet does not cause a state update, the decision engine proceeds to step 712, where the metadata is removed and the packet is forwarded.

[0090] If, at step 704, the decision engine determines that there is a flow entry for the flow in the PFT, the decision engine determines, at step 714, whether the sequence number of the packet is equal to the sequence number of the flow entry. If the sequence number of the packet is not equal to the sequence number of the flow entry, the decision engine, at step 720, resubmits the packet and resets the timer to D (i.e., the max latency in clock cycles). If the sequence number of the packet is equal to the sequence number of the flow entry, the decision engine, at step 716, determines whether the state of the flow is dirty. If the state of the flow is dirty, the decision engine, at step 718, sets the dirty flag of the flow to FALSE (i.e., sets the state to clean) and, at step 720, resubmits the packet and resets the timer to D. If, at step 716, the decision engine determines that the flow is not dirty (i.e., the flow is in a clean state), the decision engine determines, at step 722, whether the packet causes a state update. If the packet causes a state update, the decision engine performs steps 710 and 712 as described above. If the packet does not cause a state update, the decision engine, at step 724, increases the sequence number of the flow by 1, and then proceeds to step 712, where the metadata is removed and the packet is forwarded.

[0091] Additionally, at any time, periodically, or any time the PFT is searched, the decision engine may determine, at step 726, whether a flow has timed out (i.e., timer = 0) and, if so, the decision engine, at step 730, deletes the flow entry of the flow from the PFT.
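Putting the branches of FIG. 7 together, the following is a minimal software sketch of method 700; the dict-based PFT and packets, the value of D, and the callback names (causes_update, emit_swb, resubmit, forward) are assumptions made for illustration, not elements of the disclosure.

```python
from dataclasses import dataclass

D = 64  # assumed max clock cycles needed to reprocess a packet

@dataclass
class PftEntry:
    seq_num: int
    timer: int
    dirty: bool

def strip_metadata(pkt):
    # Remove the metadata added by the stage-read PSP before forwarding.
    return {k: v for k, v in pkt.items()
            if k not in ("flow_id", "seq_num", "state")}

def method_700(pft, pkt, causes_update, emit_swb, resubmit, forward):
    f, seq = pkt["flow_id"], pkt["seq_num"]
    entry = pft.get(f)                          # step 704: PFT lookup
    if entry is None:
        if causes_update(pkt):                  # step 706
            # steps 708/710: add entry, advance seq, arm timer, mark dirty
            pft[f] = PftEntry(seq_num=seq + 1, timer=D, dirty=True)
            emit_swb(f)                         # state update writeback
        forward(strip_metadata(pkt))            # step 712
    elif seq != entry.seq_num:                  # step 714: out of order
        entry.timer = D
        resubmit(pkt)                           # step 720
    elif entry.dirty:                           # step 716
        entry.dirty = False                     # step 718: state now clean
        entry.timer = D
        resubmit(pkt)                           # step 720
    elif causes_update(pkt):                    # step 722
        entry.seq_num = seq + 1                 # step 710, as in [0085]
        entry.dirty = True
        entry.timer = D
        emit_swb(f)
        forward(strip_metadata(pkt))            # step 712
    else:
        entry.seq_num += 1                      # step 724
        forward(strip_metadata(pkt))            # step 712

def sweep_timeouts(pft):
    # Steps 726/730: delete entries whose timers (decremented elsewhere,
    # e.g., by the age_pft sketch above) have reached 0.
    for f in [f for f, e in pft.items() if e.timer <= 0]:
        del pft[f]
```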

[0092] FIG. 8 is a flowchart illustrating a packet processing method 800 according to an embodiment of the disclosure. In an embodiment, the packet processing method 800 may be implemented by a PSP of a programmable pipeline such as the PSP 202 of the programmable pipeline 200 of FIG. 2 or the stage-read PSP of FIG. 4. The method 800, at step 802, begins with the PSP storing an FST. At step 804, the PSP receives a packet. The PSP, at step 806, reads, from the FST, a state of a stateful function corresponding to a packet flow of the packet. At step 810, the PSP processes the packet based on the state of the stateful function. In some embodiments, processing the packet may include adding metadata to the packet. The metadata may include a flow identifier of the packet flow, state data, and a packet sequence number of the packet in the packet flow. Optionally, the PSP may receive state update data from a second PSP and update the state of the stateful function in the FST from a first state to a second state based on the state update data. Optionally, after updating the state of the stateful function in the FST, the PSP may receive one or more resubmitted packets from the second PSP and reprocess the one or more resubmitted packets using the updated state.
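A minimal sketch of this stage-read behavior follows, assuming the FST and packets are plain dicts; the per-flow sequence counter and the process callback are illustrative assumptions rather than the disclosure's interfaces.

```python
from collections import defaultdict

_seq_counters = defaultdict(int)  # illustrative per-flow sequence numbering

def next_seq_num(flow_id):
    _seq_counters[flow_id] += 1
    return _seq_counters[flow_id]

def stage_read(fst, pkt, process):
    flow_id = pkt["flow_id"]
    state = fst.get(flow_id)                 # step 806: read state from FST
    pkt["state"] = state                     # add metadata: state data and
    pkt["seq_num"] = next_seq_num(flow_id)   # the packet's sequence number
    return process(pkt, state)               # step 810: stateful processing

def apply_state_writeback(fst, flow_id, new_state):
    # Optional: update the stateful function's state in the FST from a first
    # state to a second state based on state update data from the second PSP;
    # resubmitted packets arriving afterward are reprocessed with new_state.
    fst[flow_id] = new_state
```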

[0093] FIG. 9 is a flowchart illustrating a packet processing method 900 according to an embodiment of the disclosure. In an embodiment, the packet processing method 900 may be implemented by a PSP of a programmable pipeline such as the PSP 204 of the programmable pipeline 200 of FIG. 2 or the stage-write PSP of FIG. 5. The method 900, at step 902, begins with the PSP receiving a packet for processing that was previously processed at a stage-read PSP of the pipeline. At step 904, the PSP determines whether the packet caused a change to the state of the stateful function. If the packet caused a change to the state of the stateful function, the PSP, at step 906, writes back, using a backward bus between the PSP and the stage-read PSP, state update data to the stage-read PSP indicating the change to the state of the stateful function. At step 908, the PSP forwards the packet. Optionally, prior to forwarding the packet, the PSP may remove metadata from the packet. Optionally, the PSP may identify subsequent packets associated with the packet flow that were processed by the stage-read PSP using a stale state and resubmit, using the backward bus, the subsequent packets to the stage-read PSP for reprocessing.
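A minimal sketch of this stage-write path, assuming a simple queue models the backward bus; the bus message format, the stale_followers input, and all names here are illustrative assumptions.

```python
from collections import deque

backward_bus = deque()  # assumed to carry writebacks and resubmitted packets

def stage_write(pkt, caused_update, stale_followers, forward):
    if caused_update:                                    # step 904
        # step 906: write back state update data to the stage-read PSP
        backward_bus.append(("swb", pkt["flow_id"], pkt.get("state")))
        # optionally resubmit packets processed with a stale state
        for follower in stale_followers:
            backward_bus.append(("resubmit", pkt["flow_id"], follower))
    pkt.pop("state", None)   # optionally strip metadata before forwarding
    forward(pkt)             # step 908
```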

[0094] FIG. 10 is a schematic diagram of a network device 1000 (e.g., a programmable network device) according to an embodiment of the disclosure. The network device 1000 is suitable for implementing the disclosed embodiments as described herein. The network device 1000 comprises ingress ports/ingress means 1010 and receiver units (Rx)/receiving means 1020 for receiving data; a processor, logic unit, or central processing unit (CPU)/processing means 1030 to process the data; transmitter units (Tx)/transmitting means 1040 and egress ports/egress means 1050 for transmitting the data; and a memory/memory means 1060 for storing the data. The network device 1000 may also comprise optical-to-electrical (OE) components and electrical-to-optical (EO) components coupled to the ingress ports/ingress means 1010, the receiver units/receiving means 1020, the transmitter units/transmitting means 1040, and the egress ports/egress means 1050 for egress or ingress of optical or electrical signals.

[0095] The processor/processing means 1030 is implemented by hardware and software. The processor/processing means 1030 may be implemented as one or more CPU chips, cores (e.g., as a multi-core processor), field-programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), and digital signal processors (DSPs). The processor/processing means 1030 is in communication with the ingress ports/ingress means 1010, receiver units/receiving means 1020, transmitter units/transmitting means 1040, egress ports/egress means 1050, and memory/memory means 1060. The processor/processing means 1030 comprises a programmable pipeline stateful packet processing module 1070. The programmable pipeline stateful packet processing module 1070 is able to implement the methods disclosed herein. The inclusion of the programmable pipeline stateful packet processing module 1070 therefore provides a substantial improvement to the functionality of the network device 1000 and effects a transformation of the network device 1000 to a different state. Alternatively, the programmable pipeline stateful packet processing module 1070 is implemented as instructions stored in the memory/memory means 1060 and executed by the processor/processing means 1030.

[0096] The network device 1000 may also include input and/or output (I/O) devices/I/O means 1080 for communicating data to and from a user. The I/O devices/I/O means 1080 may include output devices such as a display for displaying video data, speakers for outputting audio data, etc. The I/O devices/I/O means 1080 may also include input devices, such as a keyboard, mouse, trackball, etc., and/or corresponding interfaces for interacting with such output devices.

[0097] The memory/memory means 1060 comprises one or more disks, tape drives, and solid-state drives and may be used as an overflow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution. The memory/memory means 1060 may be volatile and/or non-volatile and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random-access memory (SRAM).

[0098] While several embodiments have been provided in the present disclosure, it may be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.

[0099] In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, components, techniques, or methods without departing from the scope of the present disclosure. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.