CFPGrowthPlus
- class PAMI.multipleMinimumSupportBasedFrequentPattern.basic.CFPGrowthPlus.CFPGrowthPlus(iFile, MIS, sep='\t')[source]
Bases:
_frequentPatterns
About this algorithm
- Description:
This code implements the CFPGrowthPlus algorithm for mining frequent patterns with multiple minimum support thresholds from a transactional dataset.
- Reference:
R. Uday Kiran and P. Krishna Reddy. Novel techniques to reduce search space in multiple minimum supports-based frequent pattern mining algorithms. EDBT 2011, pp. 11-20. https://doi.org/10.1145/1951365.1951370
- Parameters:
- iFile (str) – Name of the input file from which to mine the complete set of multiple minimum support based frequent patterns.
oFile (str) – Name of the output file in which to store the complete set of multiple minimum support based frequent patterns.
MIS (str) – Name of the file that holds the minimum item support (MIS) values used for mining.
sep (str) – Separator that delimits items within a transaction. The default separator is the tab character; users can override it.
- Attributes:
startTime (float) – Start time of the mining process.
endTime (float) – Completion time of the mining process.
finalPatterns (dict) – Stores the complete set of patterns.
memoryUSS (float) – Total amount of USS memory consumed by the program.
memoryRSS (float) – Total amount of RSS memory consumed by the program.
Database (list) – Stores the transactions of the database.
mapSupport (dict) – Maps each item to its frequency.
tree (class) – Represents the Tree class.
- Methods:
mine() – Starts the mining process.
getPatterns() – Returns the complete set of patterns.
savePatterns(oFile) – Writes the complete set of frequent patterns to an output file.
getPatternsAsDataFrame() – Returns the complete set of frequent patterns as a dataframe.
getMemoryUSS() – Returns the total amount of USS memory consumed by the mining process.
getMemoryRSS() – Returns the total amount of RSS memory consumed by the mining process.
getRuntime() – Returns the total runtime taken by the mining process.
creatingItemSets() – Scans the dataset or dataframe and stores it in list format.
frequentOneItem() – Extracts the frequent one-item patterns from the transactions.
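To make the frequency model concrete, the sketch below counts item supports and then keeps every itemset whose support is at least the smallest MIS value among its items, which is the usual definition of a frequent pattern under multiple minimum supports. It is a brute-force illustration only, not the CFPGrowthPlus algorithm or the library's code; the function names, the toy transactions, and the MIS values (given as support counts) are all assumptions made for the example.

```python
from itertools import combinations


def item_supports(transactions):
    """Count the number of transactions in which each item occurs."""
    counts = {}
    for transaction in transactions:
        for item in set(transaction):
            counts[item] = counts.get(item, 0) + 1
    return counts


def frequent_patterns_bruteforce(transactions, mis):
    """Keep every itemset whose support >= the minimum MIS of its items.

    Enumerates all itemsets, so it is exponential in the number of items
    and only suitable for tiny illustrative inputs.
    """
    items = sorted(item_supports(transactions))
    tx_sets = [set(t) for t in transactions]
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            support = sum(1 for t in tx_sets if t.issuperset(candidate))
            threshold = min(mis[item] for item in candidate)
            if support >= threshold:
                result[candidate] = support
    return result


# Hypothetical toy data: five transactions and one MIS value per item.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
mis = {"a": 4, "b": 3, "c": 2}
patterns = frequent_patterns_bruteforce(transactions, mis)
for pattern, support in patterns.items():
    print(pattern, support)
```

In this toy data, ("a", "c") and ("a", "b") both occur twice, yet only ("a", "c") is kept: its threshold is min(4, 2) = 2, whereas ("a", "b") would need min(4, 3) = 3.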
Execution methods
Terminal command
Format:
(.venv) $ python3 CFPGrowthPlus.py <inputFile> <outputFile> <MISFile>
Example Usage:
(.venv) $ python3 CFPGrowthPlus.py sampleDB.txt patterns.txt MISFile.txt
Note: minSup will be considered in support count or frequency
Calling from a python program
from PAMI.multipleMinimumSupportBasedFrequentPattern.basic import CFPGrowthPlus as alg

iFile = "sample.txt"      # input transactional database
MIS = "MIS.txt"           # file with the MIS values
oFile = "patterns.txt"    # output file for the mined patterns
sep = "\t"                # item separator (tab is the default)

obj = alg.CFPGrowthPlus(iFile, MIS, sep)
obj.mine()
frequentPatterns = obj.getPatterns()
print("Total number of Frequent Patterns:", len(frequentPatterns))
obj.savePatterns(oFile)
Df = obj.getPatternsAsDataFrame()
memUSS = obj.getMemoryUSS()
print("Total Memory in USS:", memUSS)
memRSS = obj.getMemoryRSS()
print("Total Memory in RSS", memRSS)
run = obj.getRuntime()
print("Total ExecutionTime in seconds:", run)
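The program above expects two input files to exist already. The following sketch writes a pair of small files that could serve as such inputs. The file layout is an assumption, not something this page states: one transaction per line with items joined by the separator, and one "item, separator, MIS value" entry per line in the MIS file. Check the library's own sample data before relying on it. The helper name write_sample_inputs and the toy values are hypothetical.

```python
import tempfile
from pathlib import Path


def write_sample_inputs(directory, sep="\t"):
    """Write a toy transactional database and a toy MIS file.

    Assumed layout (not confirmed by the documentation):
      sample.txt : one transaction per line, items joined by `sep`
      MIS.txt    : one `item<sep>value` pair per line
    """
    directory = Path(directory)
    transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
    mis = {"a": 4, "b": 3, "c": 2}  # MIS values as support counts

    db_path = directory / "sample.txt"
    mis_path = directory / "MIS.txt"
    db_path.write_text("".join(sep.join(t) + "\n" for t in transactions))
    mis_path.write_text(
        "".join(f"{item}{sep}{value}\n" for item, value in mis.items())
    )
    return db_path, mis_path


# Write into a temporary directory and read the files back for inspection.
with tempfile.TemporaryDirectory() as tmp:
    db_path, mis_path = write_sample_inputs(tmp)
    db_lines = db_path.read_text().splitlines()
    mis_lines = mis_path.read_text().splitlines()

print(db_lines)
print(mis_lines)
```

With files like these in the working directory, iFile and MIS in the program above would point at sample.txt and MIS.txt respectively.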
Credits
The complete program was written by P. Likhitha under the supervision of Professor Rage Uday Kiran.
- getMemoryRSS()[source]
Returns the total amount of RSS memory consumed by the mining process.
- Returns:
RSS memory consumed by the mining process
- Return type:
float
- getMemoryUSS()[source]
Returns the total amount of USS memory consumed by the mining process.
- Returns:
USS memory consumed by the mining process
- Return type:
float
- getPatterns()[source]
Returns the set of frequent patterns once the mining process has completed.
- Returns:
the frequent patterns
- Return type:
dict
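Because getPatterns() returns a dict, ordinary dictionary tooling applies to the result. The sketch below ranks patterns by support, assuming the dict maps each pattern to its support count; how the keys are represented (string, tuple, or otherwise) is not stated here, so the example dict and the helper name top_patterns are hypothetical.

```python
def top_patterns(patterns, k=3):
    """Return the k (pattern, support) pairs with the highest support.

    Ties are broken by the pattern's string form so the order is stable.
    """
    ranked = sorted(patterns.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return ranked[:k]


# Hypothetical stand-in for the dict returned by getPatterns().
example = {"a": 4, "b": 3, "a\tc": 2, "c": 3}
best = top_patterns(example, k=3)
print(best)
```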
- getPatternsAsDataFrame()[source]
Returns the final frequent patterns as a dataframe.
- Returns:
the frequent patterns in a dataframe
- Return type:
pd.DataFrame
- getRuntime()[source]
Returns the total runtime taken by the mining process.
- Returns:
total runtime taken by the mining process
- Return type:
float
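For readers who want comparable numbers outside the class, the sketch below shows one way runtime, RSS (resident set size), and USS (unique set size) can be measured around an arbitrary task. It is an assumption-laden illustration: this page does not say how the class obtains these values, the helper name measure is hypothetical, and the memory figures rely on the third-party psutil package, which is treated as optional here.

```python
import time

try:
    import psutil  # third-party and optional; memory stats are skipped without it
except ImportError:
    psutil = None


def measure(task):
    """Run `task()` and report its runtime plus current process memory.

    Memory values are in bytes and stay None when they cannot be read.
    """
    start_time = time.perf_counter()
    result = task()
    end_time = time.perf_counter()

    stats = {"runtime": end_time - start_time, "memoryRSS": None, "memoryUSS": None}
    if psutil is not None:
        process = psutil.Process()  # the current process
        stats["memoryRSS"] = process.memory_info().rss
        try:
            # USS is not reported on every platform, hence getattr.
            stats["memoryUSS"] = getattr(process.memory_full_info(), "uss", None)
        except psutil.Error:
            pass
    return result, stats


# Hypothetical workload standing in for a mining run.
result, stats = measure(lambda: sum(range(100000)))
print("Runtime in seconds:", stats["runtime"])
print("RSS in bytes:", stats["memoryRSS"])
print("USS in bytes:", stats["memoryUSS"])
```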