Parallel Tree Order Finder
Data placed in tree nodes are aggregated according to the structure of the tree to which they belong using tree accumulation. Aggregation can be as simple as an arithmetic operation or as complex as a function, depending on the function used. An arithmetic expression evaluation is similar to this process, except that only one operation on the entire tree is required, and no arithmetic operations are included in internal nodes. A parallel accumulation programme has been written to implement tree data structures.
Our code was developed and tested on fourteen Sun SMPs.
SIMPLE threaded library was used to implement parallelism on the processors.
In this report, we present our findings and describe our work.
In general, parallel implementation of tree collection and other algorithms
requires the input to be organised and preprocessed in some way, such as entering a tree in post
because of the order or preorder of its nodes These
Activities are just as important as the main task at hand because if not done properly, they become performance bottlenecks. Some
Writers throw in the towel and assume that
After “pre-processing,” data has a specific property. In this article, we go over these topics in depth.
report.
TreeGen is a small programme that generates random trees for use in our runs. A random tree is one in which each node contains an arbitrary value and has an arbitrary (up to a specified maximum degree) number of child nodes. The nodes’ arbitrary values can be used later in tree accumulation or other operations. In the figure below, we only show the node numbers. The tour is unconcerned about the node weights or values. Figure-1 shows an example input tree and the output file from treeGen that specifies it. The DIMACS file format is simply a list
To obtain an Euler tour of the undirected tree that we will be working on
After performing the accumulation, we convert this tree into a directed graph.
first, replace each edge of the tree with two opposing arcs
in the opposite direction This produces an Eulerian graph because every node in
The graph will have an even degree distribution. Figure 3 depicts the input tree of
Figure 1 has been transformed in this way.
So far we have organized our information by mail. This allows data to be shared between processors in a way that is necessary to assemble the tree, which is the next step. To collect the tree , we identify the leaf nodes in the order following the tree and share them among the processors, where each processor holds references to the part of the common table that contains the order of that post . To do top-down tree collection in parallel, each processor starts at its own set of leaf nodes and collects data at each node with its parent, and then marks the subgroup as deleted. The number of children of the parent is reduced and the parent is marked as a new leaf node to be processed in the next iteration if its child number is zero. Two tables are maintained, sheets0 and sheets1. Leaves0 holds the current leaves and left1 holds leaves for future iterations. Array pointers are swapped after each iteration. Repeating this process and going up the tree until reaching the root performs an upward stack.
Below is the sample of the code performing the above task.
Another consideration is that runtimes are dominated.
by the time spent constructing the Euler chain and computing
As a result, these processes should not be overlooked.
In this report, we demonstrated how preprocessing can and should be done concurrently, as well as how it can be used concurrently in the main task.
The choice of data structures is, as always, based on the operations we want to perform on the data. We made use of the attached
Reading the DIMACS tree file into memory is the first thing our programme does.
the tree in the adjacency list shown in figure 3 after storage. the
Afterward, the programme runs a parallel process to discover the Euler tour.
the subsequent code:
Take note of how the edgeList value has been subtracted in the second line. Here is
Why did we mention that the memory location or index for each edge is
It is crucial because it matches the number of that branch’s edge in the tree.Z
To perform parallel Euler rotations and reorderings, use adjacency lists. Following that,
This column order was used to construct a parallel tree. We hope this helps.
The technical report will aid in the implementation of these features in the
future.
Another factor to consider is that runtimes are dominated.
by the time spent constructing the Euler chain and computing the
As a result, these procedures should not be overlooked.
In this report, we demonstrated how preprocessing can and should be done in parallel, as well as how it can be used in parallel in the main task.
The choice of data structures is, as always, strictly related to the operations we want to perform on the data. We used the attached
Adjacency lists are used to perform parallel Euler rotations and reorderings. We then
This column order was used to create a parallel tree. We are hopeful.
This technical report will aid in the implementation of these features in the
future.
An Euler tour is a traversal of a graph such that each edge of its graph is visited exactly once. The Euler cycle technique is required to compute different graphs in parallel. For example, computing the backorder of a tree , which is equivalent to performing a depth-first search, is sequential in nature because the processor cannot determine the value or order of its nodes without waiting for the output of the previous . processor to learn the initial value of this first node. The starting order depends on the nodes that appear before the depth first search. Post-ordering and other functions are achieved by the Euler tour technique without depth-first searching. c tree file generated by c treeGen p format 8 7 n 0 53 e 0 1 e 0 2 n 1 1 e 2 2 2 2 E 3 7 N 31 N 31 N 5 85 N 6 30 N 7 11 6 7 5 ? First, we need this tree by replacing each edge of the tree with two opposite arcs. This results in an Euler graph since every node is even.
Figure 3 shows the input tree of template transformed in this way. Roundabout Result: The result of a Euler circle is a follower containing elements equal to twice the number of edges in the input tree (since each edge is replaced by two directed arcs for a Euler circle). Each cell in this following table represents an edge (index is the edge number) that points to the next edge in the circle. For example, if follower[2] = 7, then edge number 7 follows edge number 2 with an Euler tour. The sequence table in Figure represents the cycle of Euler edges: 1- -9–13–10–1 -8–5–11–3- 2–7–12–6.
Data Structures: Graphics are usually represented by a neighbor list data structure. We extend the typical adjacency list structure to facilitate parallel computation of successor arrays (Euler rotation). For this purpose, we are making three changes. The first two were proposed by [3]. The changes to the adjacency list are as follows: 1. Make the adjacency list circular 2. Add flip pointers aÆb to each edge this pointer points to the flip edge of the graph, edge bÆa. 3. The edge list is contiguous and the position of each edge in it uniquely identifies that edge. We found that this condition is necessary for the algorithm to work. So the structure of one edge looks like this: nextR is the reverse pointer and val is the second node of the edge. If “val” is node “c” and nextR leads to an edge whose “val” is value “a”, then we know that this node represents edge aÆc and its index in the edgeList[] table (see Figure 3 ). ) is the input tree edge number . This is important because a trace table with an Euler turn of refers to a tour with edge numbers. The next pointer simply points to the next edge of the list and becomes circular. next struct edge{ int val; struct edge * next; struct edge * nextR; } Figure-2 nextR val Our program starts by reading the DIMACS tree file into memory and saving the tree in the adjacent list shown in Figure-3. The program then executes the Euler loop in parallel, executing the following code: Notice the decrease in the edge list value on the second line. That’s why we mentioned that the index or location of each edge in memory has meaning, it corresponds to the number of that edge in the tree. Subsequent array elements have two pointers, the successor value trivially points to the next edge. And the prefix value, which can contain the weights assigned to some edges. This code is abstracted according to the algorithm in Figure 3 without the prefix line. The prefix or weight of each node is reset to zero. We need a prefix value of for each edge so that we can sort after the outing if it is found . Getting the column order of a circle: We assign a value of 1 to edges that point up and a value of 0 to edges that point down down We then run parallel prefix sums on trees. This results in a reordering of the nodes. If preordering was desired, it could be obtained by routing the 1s and 0s from the ‘s rising edges to the ‘s falling edges and then performing parallel prefix sums on the edges. Parallel prefix sums are provided by David Helman and are an implementation of the parallel prefix sum algorithm in [3]. The following code summarizes the computation in a post table given that contains the correct Euler rotation and prefix sums in the prefix field: This block of code results in a post[] table containing a list of the nodes of the tree in post. order On the next page, we show an illustration of the data on structures discussed so far, which are needed for the Euler cycle and to obtain the post-sequence.