CIS 501
Spring 2003
Assignment 02p

Introductory Assignment – Something to get you started

  1. Prepare an input file "fileA.dat" which contains the input data. Test your program with the data given in Assignment Number 3 (and to do the trace I asked you to do in class today).

  2. Write a FUNCTION that will program read the input data and distribute these data into two files: "fileC.dat" and "fileD.dat". When this is done, the two files should be padded (use the value 999) so that the number of data in both files is the same and is a power of 2. Here is one way to one way to do this.
    1. As data is being distributed, count the number of data items distributed to each file and store these counts in:
      • count_c for "fileC.dat".
      • count_d for "fileD.dat"
      Note that these values may or may not be the same. If not the same, they should differ from each other by at most one.

    2. Next compute two values named: power_c and power_d using the following equations:
                          log(count_c)
      power_c = ceil( ------------------)
                          log(2)
      

      log(count_d) power_d = ceil( ------------------) log(2)

      note: ceil() and log() are functions included in "math.h". Power_c and power_d should be equal. You should check that this is the case. We will therefore use just power_c for the rest of our computations.

    3. Now, the total number of data we want to have in these two files should be the same and should be a power of two. We should be able to determine this value as follows:
      number = pow(2, power_c)

  3. If the count_c < number, you need to put (number – count_c) more 999 into "filec.dat". For example: if number=16 and count_c=10, you need to add 6 more 999 into file "filec.dat". Then close this file.

    If the count_d < number, you need to put (number – count_d) more 999 into "fileD.dat", for example: if number=16 and count_d=11, you need to add 5 more 999 into file "fileD.dat". Then close this file.

  4. Now note that both files C and D should contain the same number of data items. The total number of data items on the two files, call it n, that you have to work with in your sort, is given by 2 * number:
  5. Output "fileC.dat" and "fileD.dat" as well as n and the values of count_c and count_d. Why are these last two values important? The output of the two files you just closed is a bit tricky. How will you do this?
  6. Now you will be able to finish the merge sort (at least you will be able to do this after the next lecture).