CFPGrowthPlus

class PAMI.multipleMinimumSupportBasedFrequentPattern.basic.CFPGrowthPlus.CFPGrowthPlus(iFile, MIS, sep='\t')[source]

Bases: _frequentPatterns

About this algorithm

Description:

This code implements the CFPGrowthPlus algorithm for mining frequent patterns with multiple minimum support thresholds from a transactional dataset.
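To make the frequency criterion concrete: in the multiple minimum support model, every item has its own minimum item support (MIS), and a pattern is normally considered frequent when its support is at least the smallest MIS among its items. The brute-force sketch below illustrates only this definition on a toy database. It is not the tree-based CFPGrowthPlus procedure, and the names mine_with_mis, toy_db, and toy_mis are hypothetical.

```python
from itertools import combinations


def mine_with_mis(transactions, mis):
    """Brute-force enumeration for illustration only (exponential in the item count)."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing every item of the candidate.
            support = sum(1 for t in transactions if set(candidate) <= set(t))
            # The pattern's threshold is the lowest MIS among its items.
            threshold = min(mis[i] for i in candidate)
            if support >= threshold:
                frequent[candidate] = support
    return frequent


toy_db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
toy_mis = {"a": 4, "b": 3, "c": 2}  # hypothetical MIS values, given as support counts
patterns = mine_with_mis(toy_db, toy_mis)
for pattern, support in patterns.items():
    print(pattern, support)
```

In this toy run, the pattern (a, b) has support 2 but needs min(4, 3) = 3, so it is rejected, whereas (a, c) also has support 2 and only needs min(4, 2) = 2, so it is kept.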

Reference:

R. Uday Kiran and P. Krishna Reddy. Novel techniques to reduce search space in multiple minimum supports-based frequent pattern mining algorithms. EDBT 2011, pp. 11-20. https://doi.org/10.1145/1951365.1951370

Parameters:
  • iFile (str) – Name of the input file from which the complete set of multiple-minimum-support-based frequent patterns is mined.

  • oFile (str) – Name of the output file in which the complete set of multiple-minimum-support-based frequent patterns is stored. It is used when running from the terminal or when saving results; it is not a constructor argument.

  • MIS (str) – Name of the MIS file that supplies the minimum item support values used for mining.

  • sep (str) – Separator that distinguishes items from one another in a transaction. The default separator is a tab, but users can override it.
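As a rough illustration of the two input files, the sketch below writes a small transactional database and a matching MIS file. The layout is an assumption rather than something stated on this page: one transaction per line with items joined by sep, and one "item, separator, MIS value" entry per line in the MIS file. The helper write_sample_inputs and the sample values are hypothetical.

```python
import os
import tempfile


def write_sample_inputs(directory, sep="\t"):
    """Write a toy database and MIS file; the file layout is an assumption."""
    transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
    mis_values = {"a": 3, "b": 2, "c": 2}  # hypothetical MIS values (support counts)
    db_path = os.path.join(directory, "sampleDB.txt")
    mis_path = os.path.join(directory, "MISFile.txt")
    with open(db_path, "w") as f:
        for transaction in transactions:
            f.write(sep.join(transaction) + "\n")
    with open(mis_path, "w") as f:
        for item, value in mis_values.items():
            f.write(f"{item}{sep}{value}\n")
    return db_path, mis_path


# Write the files into a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    db_path, mis_path = write_sample_inputs(tmp)
    with open(db_path) as f:
        db_lines = f.read().splitlines()
    with open(mis_path) as f:
        mis_lines = f.read().splitlines()
```

If the library expects a different MIS file layout, only the second write loop needs to change.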

Attributes:
  • startTime (float) – To record the start time of the mining process.

  • endTime (float) – To record the completion time of the mining process.

  • finalPatterns (dict) – Storing the complete set of patterns in a dictionary variable.

  • memoryUSS (float) – To store the total amount of USS memory consumed by the program.

  • memoryRSS (float) – To store the total amount of RSS memory consumed by the program.

  • Database (list) – To store the transactions of a database as a list.

  • mapSupport (dict) – To maintain each item together with its frequency.

  • tree (class) – Represents the Tree class.

Methods:
  • mine() – Starts the mining process.

  • getPatterns() – Returns the complete set of patterns.

  • save(outFile) – Writes the complete set of frequent patterns to an output file.

  • getPatternsAsDataFrame() – Returns the complete set of frequent patterns as a dataframe.

  • getMemoryUSS() – Returns the total amount of USS memory consumed by the mining process.

  • getMemoryRSS() – Returns the total amount of RSS memory consumed by the mining process.

  • getRuntime() – Returns the total runtime taken by the mining process.

  • creatingItemSets() – Scans the dataset or dataframe and stores it in list format.

  • frequentOneItem() – Extracts the frequent one-item patterns from the transactions.
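The item-counting step behind mapSupport and frequentOneItem() can be pictured with the sketch below. It is a simplified stand-in, not the library's implementation: it counts in how many transactions each item occurs and keeps the items whose support reaches their own MIS value. The algorithm's actual pruning rule may retain additional items, and the function names here are hypothetical.

```python
from collections import Counter


def count_item_supports(transactions):
    """Count the number of transactions in which each item occurs."""
    counts = Counter()
    for transaction in transactions:
        # set() ensures an item repeated within one transaction is counted once.
        counts.update(set(transaction))
    return dict(counts)


def frequent_single_items(supports, mis):
    """Keep items whose support meets their own MIS; items without an MIS are dropped."""
    return {
        item: support
        for item, support in supports.items()
        if support >= mis.get(item, float("inf"))
    }


toy_db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
toy_mis = {"a": 5, "b": 3, "c": 2}  # hypothetical MIS values (support counts)
supports = count_item_supports(toy_db)
frequent_items = frequent_single_items(supports, toy_mis)
```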

Execution methods

Terminal command

Format:

(.venv) $ python3 CFPGrowthPlus.py <inputFile> <outputFile> <MISFile>

Example Usage:

(.venv) $ python3 CFPGrowthPlus.py sampleDB.txt patterns.txt MISFile.txt

Note: Minimum support values are interpreted as support counts (frequencies).

Calling from a python program

from PAMI.multipleMinimumSupportBasedFrequentPattern.basic import CFPGrowthPlus as alg

iFile = "sample.txt"

MIS = "MIS.txt"

oFile = "patterns.txt"

# Items in each transaction are separated by a tab (the default)
sep = "\t"

obj = alg.CFPGrowthPlus(iFile, MIS, sep)

obj.mine()

frequentPatterns = obj.getPatterns()

print("Total number of Frequent Patterns:", len(frequentPatterns))

obj.save(oFile)

Df = obj.getPatternsAsDataFrame()

memUSS = obj.getMemoryUSS()

print("Total Memory in USS:", memUSS)

memRSS = obj.getMemoryRSS()

print("Total Memory in RSS:", memRSS)

run = obj.getRuntime()

print("Total ExecutionTime in seconds:", run)

Credits

The complete program was written by P. Likhitha under the supervision of Professor Rage Uday Kiran.

getMemoryRSS()[source]

Returns the total amount of RSS memory consumed by the mining process.

Returns:

RSS memory consumed by the mining process

Return type:

float

getMemoryUSS()[source]

Returns the total amount of USS memory consumed by the mining process.

Returns:

USS memory consumed by the mining process

Return type:

float

getPatterns()[source]

Returns the set of frequent patterns after the mining process has completed.

Returns:

the frequent patterns

Return type:

dict
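Because the patterns come back as a dictionary, ordinary dictionary operations are enough to inspect them. The sketch below ranks patterns by support, assuming each key is a pattern and each value is its support. The sample dictionary stands in for the result of getPatterns(), its key format is hypothetical, and top_patterns is not part of the library.

```python
def top_patterns(patterns, k=3):
    """Return the k patterns with the highest support, ties broken by pattern name."""
    ranked = sorted(patterns.items(), key=lambda entry: (-entry[1], entry[0]))
    return ranked[:k]


# Hypothetical stand-in for the dictionary returned by getPatterns().
sample_patterns = {"a": 4, "b": 3, "a\tc": 2, "c": 3}
best_two = top_patterns(sample_patterns, k=2)
```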

getPatternsAsDataFrame()[source]

Stores the final frequent patterns in a dataframe.

Returns:

the frequent patterns as a dataframe

Return type:

pd.DataFrame

getRuntime()[source]

Calculates the total runtime taken by the mining process.

Returns:

total runtime taken by the mining process

Return type:

float

mine()[source]

Main method that starts the mining process.

printResults() → None[source]

Prints the results of the mining process.

save(outFile)[source]

Writes the complete set of frequent patterns to an output file.

Parameters:

outFile (file) – name of the output file

startMine()[source]

Main method that starts the mining process.