|
Definition: An ARRAY is a ordered set of a specific, finite number of same-type nodes, whose structural properties involve the relative positions of the nodes. The nodes are randomly (or directly) accessible by a numeric index. The addresses of array members are usually contiguous in the virtual storage of the computer. Order is not of great importance in an Array. There are two basic operations for accessing an Array: extraction and storing. You populate an Array by storing into the first available free slot. To query or access a stored element you directly address its slot location. To store or extract an element from the middle of a populated Array requires that all elements on one side of the insertion/extraction point are shifted by one slot to either make room or to contract the space needed to accomodate the change.
Boundary Conditions include
the following:
To access the pseudo-code source code: arraytest.java, arraytest.cpp, |
|
Definition: A STACK is a linear list or array with the constraint that all insertions and deletions (and queries) are made at a single end of the list, usually known as the TOP of the stack. The order in which elements can be retrieved from a stack is Last-In First-Out (LIFO) or First-In-Last-Out (FILO). Inserting a data item on the top of the stack is called "pushing" into the stack, while removing an item from the top is called "popping" off the stack. Examining the contents of the top slot of the stack is called "peeking" or "querying the top". The Stack can be thought of as an Array with certain constraints place upon access to its elements. While any of an Array's nodes can be directly accessed, you can only "see" the TOP node of the Stack. Note: To save some coding effort you can make Stack an extension of Array. Types of Stacks Simple Stack Dynamic Stack Linked Stack Dynamic Linked Stack
To access the pseudo-code source code: NodeElement, |
|
Definition: An algorithm or method is RECURSIVE
if it defines a procedure in terms of a simpler
case of itself. For a recursive definition to work, the definition in any given case must be well-founded, avoiding an infinite regress. That is: 1) Every recursive call must simplify the computation in some way. 2) There must be special termination condition or stopper cases to handle the simplest computations directly, without the need to call an even simpler case of itself. Sometimes the term inductive definition is used as a synonymn. A recursive call can result in many further recursive calls because the each call is dividing the problem in new smaller subproblems, or smaller instances of the same problem. For a recursive method to terminate, the problem must eventually be reduced to a stopping case. For at least one argument, a recursive function "f" must be defined in terms that do not involve "f". There must be a "way out" of the sequence of recursive calls. When reached, the method returns a result to its immediate caller. This caller, in turn, performs a computation and returns a result to its own caller. This process continues until a result is percolated back up the calling chain to the original caller. The difference between a circular definition and a recursive definition is that a recursive definition must have base cases. Base cases are those that satisfy the definition without being defined in terms of the definition itself, and all other cases comprising the definition must be "smaller" (closer to those base cases that terminate the recursion). Infinite recursion happens either because the parameter values passed during a recursive call do not get simpler or because a special terminating "stopper" case is missing or never reached. Recursive function
call implementation requirements: Each call to the recursive function causes a new variable data area to be allocated and each reference to a variable in the function's data area is local to the most recently allocated area. Each return causes the current data area to be released, and the data area allocated immediately prior to the current area becomes the new current data area. Recursion vs. Iteration The difference between a recursive function and an iterative function is that the iterative type calls for the explicit repetition of some process until a certain condition has been met.Recursion is generally more expensive in computing resources than non-recursive iterative methods. The run environment must save and recover all local and temporary variables and the "return control" locations prior to and upon return from each recursion instance. A stack is generally used for this purpose. The costs for calls to push, pop and peek as well as handling of boundary conditions could get expensive. In a non-recursive procedure the stacking of unnecessary variables can be eliminated. However, from a programming point of view, the recursive solution may be the most natural and logical way of solving a problem.
To access the pseudo-code source code: Factorial |
|
Definition: ALLOCATION is the acquiring of storage to accommodate the data structures and supporting code of a given task. DEALLOCATION is the releasing of previously allocated virtual storage.
Queues A QUEUE is a linear list or array with the constraint that all insertions are made at one end of the list, and all deletions are made from the other end. Data is processed First-In-First-Out (FIFO). Like a stack, a queue can be thought of as an array with certain constraints placed upon access to its member elements.
BOUNDED QUEUES which are of fixed length such as the "pipe queue" and "circular queue" and UNBOUNDED QUEUES which can expand or contract, such as the "linked queue".
A PRIORITY
QUEUE has all the attributes of the standard QUEUE
with the added capability that items can be inserted into the structure
at any position (prioritizing).
LISTS A LIST is an ordered array. Order is usually imposed on the array by linkages to another node of the array. Lists differ from arrays in that they are not necessarily arranged in contiguous storage locations. Lists have a disadvantage compared to Arrays in that to access an element stored in a List you must traverse the structure by following the linkages; whereas Arrays permit direct random addressability. Lists have an advantage over Arrays when inserting or removing a node from other than the structure's endpoints. Only a simple adjustment of the linkage pointers is required. A RECORD is a set of related fields. In JAVA and NetRexx a record
is defined by an Object. Both the JAVA Object and C++ Structure contains non-executable data. A LINKED
LIST is an ordered array that is logically
connected where each record is linked to the next by a single pointer.
// C++ Code Skeleton (HINTS) To access the C++ source skeleton code file: Click Here
|
|
Definition: A TREE structure allows us to organize data hierarchically. Real-world trees occur in many places: geneology, sports tournaments, corporate ornanizations, operating system file structures, etc. Since trees are structurally more complex than lists, they have a more extensive nomenclature for referring to their various components. Root, nodes, leaves, branches. No node can be its own ancestor or descendant. Two nodes may not have the same child. Binary Trees A BINARY TREE is a finite set of single-linked elements that is either empty or partitioned into at most 3 disjoint subsets. The first subset contains a single element called the ROOT of the tree. The other subsets are called the left and right subtrees of the original tree. These subtrees are in themselves binary trees. A tree NODE
is a single element of the tree.
A node with no children is called a LEAF.
The LEVEL
or HEIGHT of
a node in a tree is the node's distance from the root. The series of nodes connecting an ancestor to a descendant is called a PATH. Its length is known as the PATHLENGTH. Definition of an Ordered Tree A tree is said to be inorder if the following two requirements hold: (A) Its left subtree contains
only those values that are less than (or =) its root value. Visiting the nodes of a tree in a specific order is called a TRAVERSAL. PRE-ORDER TRAVERSAL (also known as depth-first order) visits the root, then the left subtree in preorder, then the right subtree in preorder. IN-ORDER TRAVERSAL (also known as symmetric order) visits the left subtree in inorder, then the root, then the right subtree in inorder. POST-ORDER TRAVERSAL(also known as postorder) visits the left subtree in postorder, then the right subtree in postorder, followed by the root.
A THREADED Tree is one constructed so that the right links of all leaf nodes point to the next node required to be visited during an in-order traversal. This provides a "short cut" in traversing the tree. Programming demands for storage and computation are reduced.
|
|
Definition: A FIELD is a set of N bytes that represents some abstract information. A RECORD
is a collection of N related fields. A DATABASE
is a collection of N related files. A merge is the process of combining two or more sorted files into a resultant sorted file.
Exponential order sorts grow more rapidly in the processing time required with the increase in the number of items sorted (N). That is as the number of items to be sorted increased linearly, the time to perform the sort increases exponentially. Orders of Functions The common sorting algorithms can be divided into two classes by the complexity of their algorithms. Algorithmic complexity is a complex subject, suffice it to say that there's a direct correlation between the complexity of an algorithm and its relative efficiency. Algorithmic complexity is generally written in a form known as Big-O notation, where the O represents the complexity of the algorithm and a value n represents the size of the set that the algorithm is run against. Big-O notation is frequently used to describe a sort algorithm's use of computational resources usually with regard to consumed CPU cycles and wall-clock time.
For example, O(n) means that an algorithm has a linear complexity. In other words, it takes ten times longer to operate on a set of 100 items than it does on a set of 10 items (10 * 10 = 100). Complexity of O(log n)
means that an algorithm has logarithmic
complexity. It takes the two times longer to operate on a set
of
1000 items vs a set of 100 items. This is extremely efficient. The complexity was O(n**2) (quadratic complexity with exponential order), then it would take 100 times longer to operate on a set of 100 items than it does on a set of 10 items. The classes of strongly polynomial algorithms which run in O(n**2) and less, include the bubble, insertion, selection, and shell sorts; If the sort is on order of O(c**n) this has geometric complexity. If the sort is on order of O(n!) this has factorial complexity. Weakly polynomial would include any sorts which run in O(nk) time and O(n log n) (polynomial order) which includes the heap, merge and quick sorts.
A Bubble Sort passes through the file sequentially several times. Each pass consists of comparing each element in the file with its successor neighbor ( x[i] with x[i +1] ) and interchanging the two elements if they are not in the proper order. Note that after each pass or iteration, the (nth - i) element is in the proper order, so you can improve the sort time by reducing the scans to the remaining (n - i - 1) items. The bubble sort is generally considered to be the most inefficient sorting algorithm in common usage. Under best-case conditions (the list is already sorted), the bubble sort can approach a constant O(n) level of complexity. General-case is an abysmal O(n2). A Bi-directional Bubble Sort improves on the Bubble Sort by alternating the direction of successive passes through the file between the top and the bottom. In this way the smaller elements move quickly to the top of the order in the same way that the larger elements move to the bottom of the file. Every successive pass compares the remaining unsorted items of the previous pass. A Selection Sort works by selecting the smallest unsorted item remaining in the list, and then swapping it with the item in the next position to be filled. The selection sort has a complexity of O(n2). A relatively slow sort.
The Quick Sort, also called the partition exchange sort, first provides a partitioning method for the Ath element (called the pivot) to seek its final position J in the partition. At this point elements in positions 0 through J-1 are less than or equal to A; and all elements in positions J+1 through N are greater than or equal to A. The process is recursively reapplied on both newly formed left and right partitions until all partitions are of size 1. Now the file is sorted.
Now recursively sort the new partition to the left of the pivot, and then sort the new partition to the right of the pivot (not including the pivot in either sort). The efficiency of the algorithm is impacted by which element is choosen as the pivot point. The worst-case efficiency of the quick sort, O(n2/2), occurs when the list is sorted and the left-most element is chosen. Randomly choosing a pivot point rather than using the left-most element is recommended if the data to be sorted isn't random. As long as the pivot point is chosen randomly, the quick sort has an algorithmic complexity of O(n log n).
The Heap Sort is an implementation of the general selection sort using the input array as a heap representing a descending priority queue. The preprocessing phase creates a heap of size N using the siftup operation, and the selecton phase redistributes the elements of the heap in order as it deletes elements from the priority queue using the siftdown operation. The Heap Sort has a complexity of O(n log n). In the average case, the heapsort is not as efficient as the quicksort. The Heap Sort requires twice as much time as Quicksort for random input.
INSERTION SORTS are similar to Exchange Sorts, but the essence of their algorithms insert into an existing sorted file. A Simple Insertion Sort is one that sorts a set of records by inserting records into an existing sorted file. Such as you would sort a hand of cards. Like the Bubble Sort, the Insertion Sort has a complexity of O(n2). Although it has the same complexity, the insertion sort is a little over twice as efficient as the bubble sort. A Shell Sort sorts separate subfiles of the original file. These subfiles contain every Kth element of the original file. Each subfile is sorted by simple insertion, a new smaller K is chosen, and the file is again partitioned into a new set of subfiles and sorted again. This process is repeated with smaller values for K until K is set to 1. At this point the entire file is sorted. Invented by Donald Shell in 1959, the shell sort is the most efficient of the O(n2) class of sorting algorithms. Of course, the shell sort is also the most complex of the O(n2) algorithms.
MERGE SORTS involve combining two or more sorted files into a third sorted file. The merge sort uses a technique to sort a file by dividing the file into N subfiles and merge adjacent pairs of files. The process is repeated until there is only a single file remaining of size N. There are no more than O(n log n) comparisons in the mergesort involving less than N comparisons, but is does on average require twice as many assignments as quicksort.
To view Sort Algorithm demonstration Click Here
|
|
Definition: The HASH Table is an associative array data structure that relates keys with values. The primary operation it supports is to efficiently and rapidly provide store and retrieval/lookup functions. It is employed where speed of the access and quick retrieval of data is of primary importance. The data structure elements are divided into (preferably) equal-sized categories, or buckets, to allow quick access to the elements. The hash function determines in which bucket an element belongs.
Thus the second part of the hashing search (or storage) requires a collision-resolution process.
Hash Key Computation The arithmetic computation is usually simple to implement, but you must use caution to avoid various subtle pitfalls. If the resultant hash table is to hold N number of items or keys then we must ensure that our hashing computation results in a number between [0, N-1]. The ideal hashing function should evenly distribute its output keys across the span of the table. Avoid clumpiness due to collisions, and gaps. You must be familiar with the data to be able to generate a suitable hashing funtion and to predict the table size needed. Hashing provides a way to use a reasonable amount of both memory and computation time to strike a balance between the consumption of these two resources.
Pseudocode for sample LinkListds linklistds Pseudocode for sample String Hashing Calculation hashstringcalc 100 Random integers (1-1000) for input to lab exersize Click Here |
[Return to Professor Page]