Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
PROGRAMMABLE SORT ORDER ALGORITHM WITH GRAPHICAL USER INTERFACE
Document Type and Number:
WIPO Patent Application WO/2014/106108
Kind Code:
A1
Abstract:
A programmable sorting algorithm for computers or electronic hardware with the capability of sorting lists of objects in any possible sort order combination according to a user supplied programming list whose sequence can determine the sort order. Items in the list to sort but not on the programming list can be ignored or each distinct type grouped together and placed at the beginning or end of the sorted list. The input data can be separated, filtered and processed according to user commands. The output data is user selectable and programmable including the selection of the sorted items themselves or their original index in the list before it was sorted which is useful for sorting tables with multiple columns by a selected column.

Inventors:
MASHACK PETER (US)
Application Number:
PCT/US2013/078088
Publication Date:
July 03, 2014
Filing Date:
December 27, 2013
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
MASHACK PETER (US)
International Classes:
G06F9/45
Foreign References:
US20080141227A12008-06-12
US7620697B12009-11-17
US20110282758A12011-11-17
US20110098998A12011-04-28
US20100146004A12010-06-10
Download PDF:
Claims:
Class Claims

Claim 1 : The algorithm is implemented as a class.

Claim 2: The algorithm class encapsulates, extends, and improves existing sort algorithms by enabling the use of any objects that can be hosted in a class including, interfaces, classes, structs, properties, fields, events, and enumerations not available to existing sort algorithms.

Claim 3: The algorithm can sort lists of any types of objects.

Claim 4: The algorithm can sort lists containing more than one type of object.

Claim 5: The algorithm output data is selectable and the sort order is programmable.

Claim 6: The algorithm class contains user programmable methods that can identify and select data by any criteria, along with corresponding output data lists to store the data and optionally remove it from sort processing and filter it from the 'Sort Result List'.

Claim 7: The algorithm class contains user selectable and user programmable data output methods that can return any data type, class, data view, filtered data, data combination in any order, with a programmable sort order, as the sorted objects themselves or their original indexes, repeatedly, on user demand, without having to re-run the algorithm.

Claim 8: The algorithm class enables sort stability in non-stable core algorithms'. ' ProgrammingList' claims

Claim 9: The algorithm has a ' ProgrammingList' class constructor parameter that can be single or multi-dimensional and consist of any file, list, array, or collection of any type or types of objects or classes providing a user defined sort order programming object that can specify any possible sort order combination, column and property selection, grouping, filtering, and other sort refining commands to the sort algorithm.

Claim 10: The algorithm sort order can be programmed by the sequence of items in the ' ProgrammingList'.

Claim 1 1 : The sort algorithm can contain built-in ProgrammingLists selected with a

' ProgrammingList' class constructor parameter enumeration.

ListToSort' claims

Claim 12: The algorithm class has a ListToSort' constructor parameter that can be single or multi-dimensional and consist of any file, list, array, or collection of any type or types of objects or classes for accepting lists of objects to sort.

Claim 13: The ' ListToSort' can be sorted, filtered and grouped according to any of its dimensions if multidimensional and any of its items properties or fields if the item is a class by using ' ProgrammingList' data.

Claim 14: The sort algorithm can contain built-in 1 ListsToSort' selected with a ListToSort' class constructor parameter enumeration. 'Alien Input Items' claims

Claim 1 5: The algorithm identifies and accepts 'Alien Input Items' and provides user selectable options to choose the methods used to handle them.

Claim 1 6: 'Alien Input Items' can be removed from the sort process as they are discovered and not returned with output data.

Claim 1 7: 'Alien Input Items' can be assigned a sort weight and appended to the

'ProgrammingList' and objects that derive from it.

Claim 1 8: 'Alien Input Items' can be assigned the lightest or heaviest sort order weights with the choice being user programmable via a class constructor parameter enumeration or class property.

Claim 1 9: An Alien Input Items' list can be collected and returned to the user by an output method.

Error Tolerant Claims

Claim 20: The algorithm has error handling, is error tolerant and can complete the algorithm in the presence of errors and return reports of all errors.

Claim 21 : The algorithm tests for and handles class constructor parameter errors such as null and zero length constructor parameters, duplicate ProgrammingList' items and provides the user with constructor parameter error feedback via an error property and event.

Claim 22: The algorithm handles errors from 'Input Items' tnat throw exceptions, collects the 'Input Items' and their errors into an error list that can be returned to the user by an output method and continues running the algorithm. 'InputltemWrapperClass' claims

Claim 23: The sort algorithm class contains an 'InputltemWrapperClass' that

encapsulates and buffers, 'Inputltems', 'ProgrammingLists' and user supplied inputs or their built-in counterparts by handling their unique details and requirements and isolating them from the sort algorithm.

Claim 24: InputltemWrapper' class is the type chosen for the generic objects, collections, arrays, lists, dictionaries, etc. that are contained in both, and shared between, the Sort Algorithm Class and 'Core Algorithm' enabling the user inputs, Sort Algorithm Class, 'Core Algorithm' and output methods to integrate with each other.

Claim 25: The InputltemWrapperClass' can contain any single or multidimensional object, type, collection, interface, class, struct, procedure, method, function, property, field, event, enumeration, or anything capable of being hosted in a class, especially user defined types, required to sort and process input or output data.

Claim 26: The InputltemWrapperClass' contains a reference to the 'Inputltem' ' it wraps.

Claim 27: The InputltemWrapperClass' contains an integer property that is the original index of each Inputltem' as it occurs in the ListToSort' enabling the sorting of

multidimensional arrays, tables, and collections of objects according to the column being sorted.

Claim 28: The InputltemWrapperClass' contains an integer property that is the original index of each Inputltem' as it occurs in the ListToSort' enabling sort stability in nonstable Core Algorithms.

Claim 29: The InputltemWrapperClass' can contain an integer, enumeration or any type of property, or properties, that are initialized to zero (or reset) and get incremented (or set) as the InputltemWrapper reaches milestones, specific conditions, or completes an entire cycle in the sort algorithm, providing individualized property tag information for each

Inputltem'.

Claim 30: The InputltemWrapperClass' or Sort Algorithm Class can contain methods to perform type conversions enabling different types of items such as single-precision floating-point numbers, integers, decimals, numbers represented as strings, etc to be sorted without error.

Claim 31 : The InputltemWrapperClass' or Sort Algorithm Class can contain methods to select and create lists of distinct data, ranges of data, and data that meets any user specified criteria creating sorted data results comparable to those created by Structured Query Language.

Claim 32: The InputltemWrapperClass' or Sort Algorithm Class can contain a text encoder and decoder and properties to hold their results enabling sorting of multi length encoded characters, surrogate pairs, and composite characters.

Claim 33: The InputltemWrapperClass' or Sort Algorithm Class can contain a method to convert left to right, right to left, and bi-directional text into a uniform format and a property to hold the result of the method enabling sorting of lists containing a mixture of left to right, right to left, and bi-directional text.

Default Core Algorithm Claims

Claim 34: The Default Core Algorithm contains complex multi-dimensional collections of collections of 'InputltemWrapper' buffer class, 'Dictionary', 'ReverseDictionary', 'CurrentColumnCharacterReport', 'CurrentColumnCharacterPopulationsList', 'RemainingCollections', 'RemainingCollectionsTopBucket', 'SortBucketsCollectlon', and 'SortBucketsTopBucket' as a objects to improve the speed and memory

performance of and provide extended, programmable, data output options to existing sort algorithms.

Claim 35: The Default Core Algorithm optimizes memory resources by using the same memory for 'WrappedListToSort' and 'SortResultList'.

Claim 36: The 'Default Core Algorithm Optimizes memory resources by creating the exact 'SortBucket' count, in the exact sort order, with each 'SortBucket' the exact required capacity for every loop iteration.

Claim 37: The 'Default Core Algorithm Optimizes memory allocation when encountering 'Alien Items'

Claim 38: Each 'Default Core Algorithm' loop tests for and uses the condition of only one 'SortBucket' being created as a method to determine if it will branch to and spend time performing additional sorting procedures.

Claim 39: If only one 'SortBucket' was created it is tested for every item being equal and if they are, they are all moved to the 'SortResultList' and a new sort cycle is started.

Claim 40: If only one 'SortBucket' was created and the items are not equal, it tests for identical characters in all items in the current column and then each successive column, while testing for and placing completely sorted items in the 'SortResultList' as it finds them, until it finds a column with not identical characters which pushes the sorted column of the items as far as possible.

Claim 41 : If only one 'SortBucket' was created additional sort procedures can be performed in order to take advantage of the statistical characteristics of, or types of objects in, the ListToSort'.

Claim 42: In cases where the statistical composition of the ListToSort and

ProgrammingListwexe such that the time required to handle Dictionary KeyNotFound exceptions exceeded the time for checking each characters Dictionary membership status, the Core Algorithm could alternatively test for the presence of each character in the Dictionary before looking them up to avoid KeyNotFound exceptions and the Sort Algorithm Class could provide an class constructor parameter enumeration or property making the choice user programmable.

Claim 43: Core Algorithm' ob\ects can be generic list types if they are tasked with resizing to handle Alien Input Items' but can be array types otherwise. The Sort

Algorithm Class can provide an enumeration or property making the choice user

programmable.

Claim 44: The 'Default Core Algorithm' can act as a drain sort by finding the heavyweight objects or a float sort by finding lightweight objects by changing the direction and order of its 'Default Core Algorithm' objects.

GUI Claims

Claim 45: The Sort Algorithm Class provides public interfaces to its internal objects enabling external programs to use them.

Claim 46: The Sort Algorithm Class has an independent graphical user interface window (GUI) that provides visualization to the algorithm.

Claim 47: The GUI provides complete access to the algorithm without having to write a program, simplifies the software programming problem of creating and validating

'ProgrammingLists' and other algorithm inputs, and contains tools for sort algorithm design and evaluation. It contains dialogs, methods and controls for selecting existing data including a file select dialog, paste, drag and drop selected text, files, and objects from other applications. For creating user supplied data and 'ProgrammingLists' \t contains controls to enter, drag, drop, paste and edit user input data, methods to validate the data, and lists of pre-composed programming scripts. It contains a Sort Algorithm Class run button and controls to operate the Sort Algorithm Class output methods to generate data, a file save dialog for saving data results, and statistical reports. For algorithm design and evaluation it contains test list and test file generators, timers, statistical reports, and output error validation procedures.

Claim 48: The GUI provides a character encoder and decoder for creating code points from characters and characters from code points.

Claim 49: The Sort Algorithm Class can contain or be given collections of Diabolical Characters via a constructor parameter, and contain methods to identify them and a property or enumeration selecting the method used to sort them.

Claim 50: The algorithm is adaptable to FPGA's, integrated circuits, and solid state hardware devices.

Description:
TITLE OF THE INVENTION

Programmable Sort Order Algorithm with Graphical User Interface INVENTOR

Peter Nicholas Mashack

United States Citizen

739 Beachnut Ave.

Simi Valley, Ca. 93065

CROSS-REFERENCES TO RELATED APPLICATIONS

Related U.S. Application Data

Provisional application No. 61 /747,974 Filed on 12/31 /2012

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR

DEVELOPMENT:

Not Applicable.

THE NAMES OF THE PARTIES TO A JOINT RESEARCH AGREEMENT: Not Applicable.

COMPUTER PROGRAM LISTING APPENDIX

The total number of compact discs including duplicates: 2

Files on each compact disc are:

File Name File Date/Time File Size

ReadMe.txt 12/26/2013 03:02 PM 2,142 bytes SkimSort.dll 06/27/2013 1 1 :54 AM 13,824 bytes SkimSortGUI.exe 12/26/2013 07:00 PM 1 13,664 bytes Sort Constructor.txt 12/14/2013 01 :34 PM 45,1 12 bytes

Sort_DataReturnFunctions.txt 2/ 4/20 3 0 :34 PM 13,591 bytes

Sort_lnputltemWrapperClass.txt 16/14/2013 01 :34 PM 1 ,716 bytes

Sort_Properties.txt 16/14/2013 01 :34 PM 1 ,224 bytes

Sort_PublicDictionaryFunctions.txt 12/14/2013 01 :34 PM 1 ,745 bytes

BACKGROUND OF THE INVENTION FIELD OF THE INVENTION:

A programmable sorting algorithm for computers, electronic hardware, and integrated circuits, capable of sorting objects in any possible sort order combination and providing selectable data outputs.

DESCRIPTION OF PRIOR ART:

The classic book 'Art of Computer Programming Sorting and Searching' by Donald E. Knuth, and many other sources, describe a dozen or more existing and well known sort algorithms. United States Patent Classification Subclass 752 Sorting and Ordering Data in Class 707 Data Processing is dedicated to sorting and ordering data. Sort algorithms are usually classified as either a 'Comparative' or 'Distributive' type. Comparative types use a scale function that weighs two items, determines if one is greater than the other or if they are equal, moves or changes their order if necessary, and repeats the process until the items are sorted. Distributive types create multiple classification buckets, like 0 to 9, or A to z, and distribute each item in its corresponding bucket, repeating the process until the items are sorted. United States Patent 4809158 'Sorting Method and Apparatus' by Peter B. McCauley Feb. 28, 1989 provides an excellent example of a distributive type algorithm.

Stability is an important property in sort algorithms. A stable algorithm will not rearrange the order of equal items, an unstable algorithm may. Stability only matters in lists with more than one column. Sort stability can be demonstrated by first sorting complete rows of an entire table with multiple columns by a number column. After the number sort is complete, re-sort again by a name column. The name sort will change the number sort in many cases. If we require that identical names maintain their number sort order after the name sort, the sort must be stable. Each basic algorithm has a multitude of variations and components. Algorithms can contain other algorithms and there is virtually no limit to the possible customization and combinations of the basic algorithms.

BRIEF SUMMARY OF THE INVENTION

Limitations of existing sort algorithms, sort algorithm patents, and sorting methods of computer operating systems, file managers, programming languages, spreadsheet programs, word processing programs, database programs, Structured Query Language, and mathematics application programs are their inability to provide programmable sort order combinations, data filters, and data view combinations, each created on demand, from a single sort algorithm result list, without having to repeat algorithm. They also lack Unicode compatibility, ability to sort items that are not members of their culture or type, and error tolerance.

There is virtually no existing literature on programmable sort algorithms. Internet search engine results of the terms 'programmable sort' or 'programmable sort algorithm' return nothing closely related to the sort algorithm described in this document.

BRIEF DESCRIPTION OF DRAWINGS

This document contains no drawings.

DETAILED DESCRIPTION OF THE INVENTION

The programmable sort algorithm is a class that can provide every possible sort order to any list, array, or collection of any type or types of objects. A class provides a much more powerful and flexible sorting tool than any existing function, method, or procedure by itself. Implementing a sort algorithm as a class enables the algorithm to use objects, types, interfaces, classes, structs, properties, fields, events, and enumerations not available to existing sort algorithms. It enables programmable sort order combinations, data filters, and data view combinations, each created on demand, from a single sort algorithm result list, without having to repeat algorithm. Implementing a sort algorithm as a class enables error handling and reporting not available with existing sort algorithms. Implementing a sort algorithm as a class should no be confused with a class that contains sort procedures, functions and methods. The programmable sort algorithm class will be referred to as 'Sort Algorithm Class'.

The Sort Algorithm Class encapsulates, extends, and improves existing sort algorithms but makes no claim as a new or novel basic sort algorithm. That should not be perceived as weakness. Any existing sort algorithm it encapsulates will be referred to as its 'Core Algorithm'. The Sort Algorithm Class provides its benefits to any Core Algorithm that is integrated into it and together they provide capabilities not available with existing sort algorithms. Statistical mathematics dictates there is no single best sort algorithm. The composition and existing order of the items being sorted determines the fastest or most memory conserving algorithm and that varies.

The Sort Algorithm Class as described in this document has a Core Algorithm that is based on, but an extension of, the existing and well known radix, count, and bucket algorithms in the distributive class of sorting algorithms. It is designed for sorting lists of Unicode encoded strings. It will be referred to as the 'Default Core Algorithm'. The Default Core Algorithm does claim new and novel extensions to the existing and well known algorithms it is based on.

The Sort Algorithm Class contains a parameter in the class constructor that accepts the list, array, collection, or object that programs the algorithm.

The sort order programming constructor parameter will be referred to as the

'Programming List' or 'ProgrammingList'. The sort order can be programmed according to the sequence of objects in the ProgrammingList. The ProgrammingList can be single or multi-dimensional and consist of any file, list, array, or collection of any type of objects. The first object in the ProgrammingList can have the lightest sort weight and the last object the heaviest sort weight or vice versa. The ProgrammingList can provide column selection, property selection, grouping, filtering, and other sort refining commands to perform on the input list to be sorted. Custom objects and classes contained in the

ProgrammingList may require the sort algorithm to be compiled with references to them.

The Sort Algorithm Class can also contain built-in ProgrammingLists that are selected with a ProgrammingList' class constructor parameter enumeration.

The Sort Algorithm Class contains a parameter in the class constructor that accepts the list, array, or collection to be sorted into the Sort Algorithm Class.

This parameter will be referred to as the 'List To Sort' or 'ListToSort'. The Sort Algorithm Class also has a property that is set to, and references, the ListToSort so it can

remember and reference it. The ListToSort can be single or multidimensional and consist of any file, list, array, or collection of any type or types of objects or classes and any list item can contain any type of object. The ListToSort can be sorted according to any of it dimensions if multidimensional, and any of its items properties or fields if the item is a class. Individual items in the ListToSort will be referred to as an Input Item' ox

'Inputltem' and groups of them as 'Input Items' o 'Inputltems'. Custom objects and classes contained in the ListToSort may require the sort algorithm to be compiled with references to them.

The Sort Algorithm Class can also contain built-in ListsToSort that are selected with a ListToSort class constructor parameter enumeration.

Input Items discovered in the ListToSort that are not a member of the

' ProgrammingList' are referred to as Alien Input Items', 'Alien Items', or 'Aliens'.

Alien Input Items can be processed by several different methods and those methods can be selected by the user via a class constructor parameter enumeration.

Alien Input Items can be removed from the sort process as they are discovered and not returned with the sorted results.

Alien Input Items can be assigned a relative sort weight according to the sequence they are discovered and appended to the ProgrammingList' and all objects derived from it. They can be assigned the lightest sort weights placing them at the top of the sorted list or the heaviest sort weights placing them at the bottom of the sorted list.

Alien Input Items' can be collected in a separate list and returned to the user by an output method.

The Sort Algorithm Class contains error handling and reporting enabling it to return valid and complete results in the presence of errors.

The Sort Algorithm Class has enumerations for, tests for, and handles class constructor parameter errors such as null and zero length constructor parameters, duplicate ProgrammingList' items, and provides the user with error feedback via an error property or event.

The Sort Algorithm Class isolates 'Input Items' that throw exceptions into an error list and continues running the algorithm.

The Sort Algorithm Class contains an input item wrapper class.

The Sort Algorithm Class places each item in the ListToSort into a wrapper class designed to buffer the input items and other user supplied object details from the Sort Algorithm Class. The wrapper class is called 'Input Item Wrapper Class' or

'InputltemWrapperClass' or 'Input Item Wrapper' or 'InputltemW rapper'. The entire wrapped ListToSort list will be referred to as the 'Wrapped List To Sort' or

'WrappedListToSort'.

The InputltemWrapper encapsulating each Inputltem is the actual object type that the Sort Algorithm Class and Core Algorithm operate on. The Sort Algorithm Class and Core Algorithm share generic objects of type <lnputltemWrapperClass>.

Some Default Core Algorithm source code examples are:

List<lnputltemWrapper> NullWrapperList;

List<lnputltemWrapper> EmptyTextWrapperList ;

List<lnputltemWrapper> DataWrapperList;

l_inkedl_ist< List<lnputltemWrapper» RemainingCollections;

List< List<lnputltemWrapper» zSortBucketsCollection;

List<lnputltemWrapper> zSingleSortBucket;

List<lnputltemWrapper> zTopBucket;

List<lnputltemWrapper> zTopBucketNotSortedToEnd;

List<lnputltemWrapper> zSingleSortBucket_NotCompletelySortedltems; Individual Inputltems wrapped in an Inputltem WrapperClass will be referred to as a 'Wrapped Item' or 'Wrappedltem' and groups of them as 'Wrapped Items' or

'Wrappedltems'.

The InputltemWrapperClass can contain any set of properties, fields, methods, objects or anything required to buffer the type of Inputltem being sorted from the Sort Algorithm Class including these properties:

1 : A property that is set to and references the Inputltem being wrapped so the Sort Algorithm Class can use and reference each Inputltem.

2: An integer property named 'Originallndex' that is the original index of the Inputltem as it was located in the ListToSort before being processed by the sort class. The sorted original indexes are useful for sorting rows of entire multidimensional arrays, tables, or collections of data where the ListToSort is a member column, equivalent to sorting a spreadsheet according to a selected column, but with a programmable sort. The property is also useful for enabling sort stability in non-stable core algorithms if the property is used in the non-stable algorithm's Comparer Interface or Delegate as the method of determining the comparer result when the Inputltems being compared are otherwise equal.

3: An integer or enumeration property, or properties, that are initialized to zero (or reset) and get incremented (or set) as the InputltemWrapper reaches milestones, specific conditions, or completes an entire cycle in the sort algorithm, providing individualized property tag information for each Inputltem'.

4: Methods to perform type conversions enabling different types of items such as single- precision floating-point numbers, integers, decimals, numbers represented as strings, etc to be sorted without error. 5: Methods to select distinct data, ranges of data, data that meets specific criteria, or user selected criteria and create data lists comparable to those created by Structured Query Language.

6: An integral text encoder and decoder and a property to hold the decoded text element representation of string type Inputitem code points, enabling sorting of multi length encoded characters, surrogate pairs, and composite characters. Using string functions to operate on strings containing multi-length encoded characters will cause the characters to be disassembled and their individual byte components to get sorted separately. Sorting individual decoded 'Unicode Characters' rather than individual encoded bytes enables sorting of multi length encoded text characters that would otherwise not be possible.

7: A method to convert left to right, right to left, and bi-directional text into a uniform format and a property to hold the method result enabling sorting of lists containing a mixture of left to right, right to left, and bi-directional text.

The Sort Algorithm Class contains independent output data lists of selected data types.

The Sort Algorithm Class contains methods that can be user programmable to identify and select data by any criteria, along with corresponding output data lists to store the data and optionally remove it from sort processing and filter from the 'Sort Result List'.

Types and classes of output lists include:

1 : An error list that will contain every Inputitem that throws an exception along with a copy of the exception itself and any necessary additional error reporting.

2: A list that will contain null Inputltems.

3: A list that will contain empty text Inputltems. 4: A list that will contain Alien Input Items'.

5: A list to contain data types like Positive Infiniti, Negative Infiniti, Not A Number, Nullable Booleans set to Null, Maximum Type Values, Minimum Type Values, etc.

6: Lists containing selected data such as distinct input items, input items that meet specific user supplied criteria, data within a range of values, Structured Query Language type data result lists, etc.

7: The sort result list. The sorted Wrapped ListToSort after being processed by the core algorithm, with optionally removed error, null, empty text, aliens or user selected items. This list will be referred to as 'Sort Result List' or 'SortResultList' or 'Data'.

The Sort Algorithm Class contains programmable output components.

Each output data list is a class property and the Sort Algorithm Class is given an

enumeration associated with each data list and every possible output combination of them.

The Sort Algorithm Class contains output methods to return the sorted 'Inputltems' or their original indexes. 'ObjectOutputMethod' returns a sorted list { 'SortedObjectLisf) of 'Inputltem Objects and 'OriginallndexOutputMethod' returns a sorted list

{ 'SortedOriginallndexLisf) of the original indexes of the 'Inputltems' as they occurred in the ListToSort. Both methods share a common input argument enumeration

{ 'ReturnSelectEnum') used select the desired data type, data, and the order of the data.

The methods look like this:

SortedObjectList = ObjectOutputMethod {ReturnSelectEnum),

SortedOriginallndexList = OriginallndexOutputMethod {ReturnSelectEnum), All possible output combinations can be obtained on demand by querying either the ObjectOutputMethod or OriginallndexOutputMethod with the desired enumeration without having to re-run the Sort Algorithm Class.

Several possible output enumerations are:

Nulls

EmptyText

Data

Nulls EmptyText

Nulls Data

EmptyText Nulls

EmptyText Data

Data Nulls

Data EmptyText

Nulls EmptyText Data

Nulls Data EmptyText

EmptyText Nulls Data

EmptyText Data Nulls

Data Nulls EmptyText

Data EmptyText Nulls

Aliens

Errors

Distinctlnputltems

ItemsGreaterThanTen

ItemsWithColorPropertyEqualToRed

ItemsThatContainTheCharacterZ

Any desired data selection method or enumeration can be added to the Sort Algorithm Class providing any possible output data view, filter, or user selected criteria, without having to re-run the algorithm. For speed and memory performance reasons, the Sort Algorithm Class or Core Algorithm may not keep copies of each individual output list per se. Wrappedltem versions of them can get created by the core algorithm and stored in the Sort Algorithm Class.

Wrappedltem versions of output lists get unwrapped, created, and returned to the user when the user calls one of the output methods with an enumeration that requests them. When an output list is returned to an external client program, keeping a local reference copy of the list in the external program may perform better than repeated calls to the Sort Algorithm Class output methods for re-creating the exact same data, depending on the ListToSort size, computer hardware memory size, and other factors

An inventory of the Default Core Algorithm data objects and their description starts here:

The Default Core Algorithm, which can be substituted or modified, is based on, but an extension of, the existing and well known radix, count, and bucket algorithms in the distributive class of sorting algorithms. It is designed to sort lists of Unicode encoded strings in any possible Unicode character sort order.

The Sort Algorithm Class and any core algorithm integrated into it will need to share the classes, objects, properties, collections and methods described above, especially the

InputltemWrapperClass, and will have identical or similar names and descriptions for them.

'ListToSort': The input string list to be sorted.

'ProgrammingList ': The character sequence that programs the sort order

'Sort Result List': Initialized as the WrappedListToSort at the very beginning of construction, then emptied into the Default Core Algorithm on the initial sort loop, then re-collects the completely sorted Wrappedltems as they become available during each sort loop, and finally becoming the output list at the end of the algorithm. Using this object for both the input and output increases the algorithm efficiency by saving the time required for memory allocation because it already has the exact required capacity needed to contain the output data.

Three independent output lists of Wrappedltems are created during construction of the WrappedListToSort

1 : NullWrapperList to hold nulls.

2: EmptyTextWrapperList to hold empty text.

3: WrappedListToSort also known as DataWrapperList to hold every input item except nulls and empty text. This list becomes 'SortResultList' when the algorithm completes.

'Dictionary':

The Dictionary keys are the ProgrammingList characters and the values are the characters positions in the ProgrammingList. It gives the first character on the

ProgrammingList a weight of 0, the second character a weight of 1 , the third character a weight of two... etc. The Dictionary returns the sort order weight of any character in the ProgrammingList that is input to it.

'ReverseDictionary':

The Dictionary in reverse. Returns the character of any sort order weight integer that is input to it.

InputltemWrapper : The class shared with the Sort Algorithm Class that encapsulates and buffers each Inputltem string and integrates the Sort Algorithm Class, Default Core Algorithm, and user data inputs.

The three properties of the Default Core Algorithm's InputltemWrapperClass are: 1 A System. Globalization. Stringlnfo representation of the Inputltem. 2: An integer property that is initialized to zero and points to the character column index that is currently being examined by the Default Core Algorithm and whose character at that index determines the SortBucket (defined below) the InputltemWrapper W be placed in.

3: An integer property that is the original index of the Inputltem as it was located in the ListToSort before being processed by the sort algorithm.

'CurrentColumnCharacterReport':

A single dimension integer list whose indexes correspond to the Dictionary characters sort weights and whose values are the total count of the characters that occur in all

Inputltems ior the character column currently being sorted. It is a census report of all Dictionary characters and their count in the current column about to be sorted. As a

Default Core Algorithm design choice, alien characters are not collected in an Aliens output list, and are appended to the end of CurrentColumnCharacterReport, Dictionary, and ReverseDictionary giving aliens to the heaviest sort weights.

CurrentColumnCharacterReport is a generic list in case it has to be resized to

accommodate aliens but may give better performance as an array if the Sort Algorithm Class is guaranteed not to be given 'Alien Input Items' or is not programmed to collect or return them. The Sort Algorithm Class can provide an enumeration or property making that choice user programmable. As a design choice for performance reasons, the Default Core Algorithm handles dictionary KeyNotFound exceptions for Aliens rather then checking each character to see if it is a member of the Dictionary. In cases where the statistical composition of the ListToSort and ProgrammingListwere such that the time required to handle KeyNotFound exceptions exceeded the time for checking each characters dictionary membership status, the Default Core Algorithm could alternatively test for the presence of each character in the Dictionary before looking them up to avoid the

KeyNotFound exceptions. The Sort Algorithm Class can provide an enumeration or property making that choice user programmable. 'CurrentColumnCharacterPopulationsList':

The CurrentColumnCharacterReportw\Vr\ the empty elements removed. 'CurrentColumnDictionary':

Dictionary for the current column. Exact same as Dictionary but only contains characters that exist in the CurrentColumnCharacterPopulationsList. This dictionary optimizes memory allocation by allowing the exact amount of SortBuckets (defined below) to be created for the current column and by initializing the capacity of each bucket in the

SortBuckets collection (defined below) to the exact required size.

'RemainingCollections'

A complex multidimensional object consisting of a linked list collection of partially sorted InputltemWrapper generic list collections. A collection in the RemainingCollections is called a 'Bucket'. The top most collection is sorted by the most character columns and contains the lightest sort weight items with the exception of the completely sorted items in the Sort Result List. Items in each collection weigh less then items in the collection below them. Items in each collection are sorted to the same column. Items in each collection may or may not be in the correct sort order. The top most collection called TopBucket' or

'RemainingCollectionsTopBucket' \s selected to fill the 'SortBucketsCollection'

(defined below) at the start each core algorithm sort cycle. A linked list was chosen because it is an efficient memory data structure for the inserting and removing Buckets at the top of it.

'SortBucketsCollection' or 'SortBuckets' or 'zSortBucketsCollection' (in source code): A complex multidimensional object consisting of a generic list collection of

'InputltemWrapper' generic list collections. A collection in the SortBucketsCollection is called a 'SortBucket'. The top most collection is called TopSortBucket' or

'SortBucketsCollectionsTopBucket'. 'SortBucketsCollection ' is created for each sort loop, one SortBucket for each 'CurrentColumnDictionary' ' item. The RemainingCollectionsTopBucket gets distributed into SortBuckets according to each items character sort weight in the column being sorted.

At the most basic simplified level the relationship of the objects is as follows:

Create the Dictionary and ReverseDictionary from the Programming List.

Step through the ListToSort and wrap all its Inputltems in InputltemWrappers. If a Wrappedltem contains null or empty text place it in the Sort Algorithm Class

NullWrapperList or EmptyTextWrapperList output lists, otherwise place it in the

WrappedListToSort.

Create a CurrentColumnCharacterReport and

CurrentColumnCharacterPopulationsList, for column zero of all wrapped strings in WrappedListToSort and use them to create the SortBucketsCollection.

Distribute the WrappedListToSort into the SortBucketsCollection thereby sorting and grouping the WrappedListToSort strings by their first character. Transfer the

SortBucketsCollection to the RemainingCollections.

Remove the RemainingCollections TopBucket and examine it for completely sorted items. Completely sorted items are identified by being the only 'Wrappedltem' \n the TopSortBucket, or by virtue of their length and the algorithm having tested every character to their last character. Move completely sorted items to the Sort Result List.

Create a CurrentColumnCharacterReport and

CurrentColumnCharacterPopulationsList for the next column of all wrapped strings from the RemainingCollections TopBucket and use them to create the

SortBucketsCollection. Distribute the remaining items from the RemainingCollections TopBucket into the

SortBucketsCollection thereby sorting and grouping them by their next character.

Place the SortBucketsCollection at the top of the RemainingCollections. The

TopSortBucket becomes the TopBucket and the cycle repeats until there are no more items to sort.

Except... If there was only one sort bucket created, run the following anti-worst case procedures:

Test all items to see if they are identical and if they are, place them all in the Sort Result List.

If they are not identical, test for identical characters in all items in the current column and then each successive column, while testing for and placing completely sorted items in the 'SortResultList' as they are discovered, until a column with not identical characters is found which pushes the sorted column of the items as far as possible for a single algorithm loop.

Place the SortBucket at the top of the RemainingCollections if it has remaining items. The TopSortBucket becomes the TopBucket and the cycle repeats until there are no more items to sort.

Inventory of Default Core Algorithm data objects and their description ends here:

The 'Default Core Algorithm' can be designed to operate in the opposite direction, acting as a drain sort by finding the heavyweight objects, changing the direction and order of its 'Default Core Algorithm Objects and placing completely sorted items at the top of

SortResultList as they are discovered. Do not confuse this concept with selecting an ascending or descending sort. The Programming List and sort order remain the same but the algorithm focuses on the heaviest rather than the lightest items and the performance of the algorithm may vary depending on the statistical properties of the ListToSort.

Default Core Algorithm pseudo source code starts here:

Sort Algorithm Class construction is initialized with the two required external inputs 'ListTo Sort' and 'Programming List'.

The following data objects are initialized:

Programming List is used to create the 'Dictionary' and 'ReverseDictionary'.

'CurrentColumnCharacterReport' is created and initialized to all zeros.

'ReturnList' is created.

'RemainingCollections' is created.

The initial sort loop steps through each item in the List To Sort and performs the following:

Constructs an instance of an 'InputltemWrapper' class using each Input Item.

After each item is wrapped,

It is placed in the 'SortedNullData' list if it is null.

It is placed in the 'SortedEmptyTextData' list if it is empty text.

If it is Data, inspect each items first character and update 'CurrentColumnCharacterReport' according to each character.

If the character not in the 'Dictionary':

1 : Add it to the end of the Dictionary.

2: Add it to the ReverseDictionary.

3: Append the character to the CurrentColumnCharacterReport. Add the Data item to 'RemainingCollections' list.

After the initial sort loop has completed, the CurrentColumnCharacterReport is complete.

Use the CurrentColumnCharacterReport to create the 'CurrentColumnDictionary' and 'CurrentColumnCharacterPopulationsList' for the Current Column.

Create a new 'SortBucketsCollection' with the capacity of 'CurrentColumnDictionary' count to sort all items in the 'RemainingCollections' into.

Create the individual SortBuckets with each SortBucket capacity according to its corresponding characters 'CurrentColumnCharacterPopulationsList' population count.

Step through each 'InputltemWrapper' in the 'RemainingCollections' and place

it in the correct SortBucket using the following steps:

For each item in the 'RemainingCollections':

Examine the 'SortedToColumnlndex'.

Examine the Character at the 'SortedToColumnlndex' of 'Stringlnfo'.

Look up the Character's SortBucket in the 'CurrentColumnDictionary'.

Place the item in the correct SortBucket .

Move the 'SortBucketsCollection' to 'RemainingCollections' maintaining the bucket order.

The algorithm is initialized.

Create a list called 'TopMostBucketNotSortedToEnd' for separating and storing

'InputltemWrappers' that have not been completely sorted.

Repeat the following steps until the 'RemainingCollections' is empty: 'Algorithm Main Loop Starts Here:'

Get a reference to the topmost bucket in the 'RemainingCollections' named

TopMostBucket', then remove it from 'RemainingCollections'.

Test the 'TopMostBucket for one 'InputltemWrapper' and if true - move the

'InputltemWrapper' to the 'ReturnList' and go back to

'Algorithm Main Loop Starts Here:'

When a Bucket with more than one item is found:

Initialize the 'CurrentColumnCharacterReport' with all zeros.

Clear all 'InputltemWrappers' if any from 'TopMostBucketNotSortedToEnd'.

Step through 'TopMostBucket' and for each 'InputltemWrapper':

Increment the column.

Examine the 'SortedToColumn Index' and 'Stringlnfo' length to

determine if 'InputltemWrapper' is completely sorted.

If it is, place the 'InputltemWrapper' in 'SortedData' list.

Else do the following:

Place the 'InputltemWrapper' in 'TopMostBucketNotSortedToEnd' list.

Increment each 'InputltemWrapper' SortedToColumnlndex".

Inspect each 'InputltemWrapper' SortedToColumnlndex character and update 'CurrentColumnCharacterReport' according to each character.

If the character not in the Dictionary:

1 : Add it to the end of the Dictionary.

2: Add it to the ReverseDictionary.

3: Append the Character to 'CurrentColumnCharacterReport' After stepping through each 'InputltemWrapper' in the TopMostBucket' if

there are no items in 'TopMostBucketNotSortedToEnd' go back to

'Algorithm Main Loop Starts Here:'

If there is one item in 'ToplVlostBucketNotSortedToEnd', place the 'InputltemWrapper' in 'SortedData' list and go back to

'Algorithm Main Loop Starts Here:'

Else do the following:

Use the CurrentColumnCharacterReport to create the 'CurrentColumnDictionary' and 'CurrentColumnCharacterPopulationsList' for the Current Column.

Create a new 'SortBucketsCollection' with the capacity of

'CurrentColumnDictionary' count to sort all items in the

'RemainingCollections' into.

Create the individual SortBuckets with each SortBucket capacity according to its corresponding characters 'CurrentColumnCharacterPopulationsList' population count.

Step through each 'InputltemWrapper' in the 'TopMostBucketNotSortedToEnd' and place it in the correct SortBucket using the following steps:

For each item in the 'TopMostBucketNotSortedToEnd':

Examine the 'SortedToColumnlndex'.

Examine the Character at the 'SortedToColumnlndex' of 'Stringlnfo'.

Look up the Character's SortBucket in the 'CurrentColumnDictionary'.

Place the item in the correct SortBucket

All 'TopMostBucketNotSortedToEnd' 'InputltemWrappers' are now in their

correct SortBucket. If there is more than one SortBucket: Move the 'SortBucketsCollection' to 'RemainingCollections' maintaining the SortBucket order, then go back to

'Algorithm Main Loop Starts Here:'.

Else if there is a single sort bucket (aka SingleSortBucket), run the

'SingleSortBucket Anti Worst Case Routines'.

Test if all 'InputltemWrappers' in SingleSortBucket have identical

'Stringlnfo. String' properties. If all identical, place them all in

'SortedData' list, then go back to 'Algorithm Main Loop Starts Here:'.

If not identical do the following:

Create a Bucket named 'NotCompletelySortedBucket' to temporarily store separated 'Not CompletelySorted' from 'CompletelySorted'

'InputltemWrappers' for each loop iteration.

The temporary 'NotCompletelySortedBucket' frees us from reallocating memory for each 'CompletelySorted' 'InputltemWrapper' that is discovered.

Increment SortedToColumnlndex Loop Starts Here: while the SingleSortBucket contains any items

{

Empty the 'NotCompletelySortedBucket'

Examine each 'InputltemWrapper' in the SingleSortBucket,

Place 'CompletelySorted' Items in 'SortedData' list.

Place not 'CompletelySorted' Items in 'NotCompletelySortedBucket'.

If there is < 2 items in SingleSortBucket it has been completely sorted.

If there is one item return it then go back to 'Algorithm Main Loop Starts Here:'.

Compare each 'Character' in column SortedToColumnlndex + 1 for all items

in the bucket to see if they are all identical.

Stop the Skim and goto SkimlsDone: on the first non-identical grapheme

found in any column. for (feSingleSortBucketltemlndex = 0; feSingleSortBucketltemlndex <

zRemainingNotReturnedltems. Count - 1 ; feSingleSortBucketltemlndex++)

{

if (!(zRemainingNotReturnedltems[feSingleSortBucketltemlndex].

Stringlnfo.SubstringByTextElements(zGraphemeTestColumn, 1 )). Equals

(zRemainingNotReturnedltems[feSingleSortBucketltemlndex + 1 ].

Stringlnfo.SubstringByTextElements(zGraphemeTestColumn, 1 ),

System.StringComparison. Ordinal))

{

If a different 'Character' is found in the column

goto SkimlsDone; //Skim is done

}

}

//If we got here, all graphemes in this column are equal.

//Increment the SortedToColumnlndex for all remaining items

for (feSingleSortBucketltemlndex = 0;

feSingleSortBucketltemlndex < RemainingNotReturnedltems. Count; feSingleSortBucketltemlndex++)

{

zRemainingNotReturnedltems[feSingleSortBucketltemlndex].

SortedToColumnlndex++;

} //Swap the buckets to reuse in the loop

zSingleSortBucket.Clear();

zSingleSortBucket.AddRange(zRemainingNotReturnedltems);

goto Increment SortedToColumnlndex Loop Starts Here:

Inspect the "Character' in column SortedToColumnlndex +1 for each

'InputltemWrapper' in the SortBucket to see if they are identical.

If a column tests true for all identical Characters, Test all items in

the TopMostCollection for new Sorted To End Items and return them(if any) as they are found

or

On the first non-identical grapheme found in a column,

stop the Skim and goto SkimlsDone:

SkimlsDone: // Returned remaining not Skimmed items to RemainingLinkedLists zRemainingCollections.AddFirst(zRemainingNotReturnedltems);

go back to 'Algorithm Main Loop Starts Here:'.

Default Core Algorithm simplified source pseudocode ends here:

Graphical user interface starts here:

The Sort Algorithm Class provides public interfaces to its internal objects enabling external programs to use them.

The Sort Algorithm Class has an independent graphical user interface window providing the input and output interfaces necessary use the Sort Algorithm Class without writing a program, enables creating and validating Sort Algorithm Class input programming objects, and provides tools for sort algorithm design, testing and evaluation.

The graphical user interface window reduces the time and effort required to create a programming input list. A comprehensive programming character list for an English language character custom sort will require 100 or more characters. Ensuring a

programming list of that size has all the required characters, no unwanted characters, is in the correct order, contains the space character, and is without duplicate characters is an error prone and time consuming task. Finding and adding characters from another culture or foreign currency symbols can further increase the difficulty.

Graphical controls, methods and procedures include:

File select dialog.

Paste, drag, and drop selected text, files, and objects from another application.

List and file generators for creating test files.

Controls to enter, drag, drop, paste and edit input data.

Controls for creating 'ProgrammingLists' and Sort Algorithm Class input data.

Methods for validating 'ProgrammingLists' and Sort Algorithm Class input data.

Lists of pre-composed programming scripts.

A Sort Algorithm Class run button.

Graphical user controls to operate Sort Algorithm Class output methods to generate data. File save dialog for saving data results.

Output error validation procedures, timers, and statistical reports. A character encoder and decoder for creating code points from characters and characters from code points.

The graphical user interface contains a Unicode encoder and decoder useful for examining and creating Unicode characters, code points and identifying 'Diabolical Characters'. Diabolical Characters exist in two types. The first type are examples like single or double quotes that have more than one Unicode representation, each having unique code points and glyphs but they look confusingly similar. It takes a close examination of them to notice the differences. Copying and pasting characters from HTML web pages can introduce them into text. The second type is composite characters with identical glyphs and code points but the code points are not in the same order. The characters are identical and indistinguishable when viewed as text but will be sorted separately by sort algorithms that do not have a method of identifying and grouping them. When the list is viewed as a sorted text list it will appear as if the algorithm failed because identical glyphs got sorted separately according to their code point order.

Sorting Diabolical Characters separately may or may not be desirable to the user of the Sort Algorithm Class. If it is undesirable, the algorithm will be required to have collections of Diabolical Character groups, each with a common group sort weight, examine each character to identify if it is a member of a Diabolical Character group and if it is, assign the common group sort weight to it. Providing the class constructor with a Diabolical Character sort method enumeration makes the Diabolical Character sort behavior programmable. The class can be given collections of Diabolical Characters via a class constructor parameter or have built in collections.

Graphical user interface ends here:

The composition of objects and methods of Sort Algorithm Class make it adaptable to electronic system solutions, FPGA's, integrated circuits, and solid state hardware devices.