PAMI.correlatedPattern.basic package

Submodules

PAMI.correlatedPattern.basic.CoMine module

class PAMI.correlatedPattern.basic.CoMine.CoMine(iFile: str | DataFrame, minSup: int | float | str, minAllConf: float, sep: str = '\t')

Bases: _correlatedPatterns

About this algorithm

Description:

CoMine is one of the fundamental algorithms for discovering correlated patterns in a transactional database. It is based on the traditional FP-Growth algorithm and uses a depth-first search technique to find all correlated patterns in the database.
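
As an illustration of the measure that the minAllConf threshold applies to: the all-confidence of a pattern X is commonly defined as the support of X divided by the largest support of any single item in X. The sketch below computes it on a toy database. The transactions and the helper names (support, all_confidence) are made up for this example and are not part of PAMI.

```python
from typing import FrozenSet, Iterable, List

# Toy transactional database (illustrative data, not a PAMI sample file).
transactions: List[FrozenSet[str]] = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c", "d"}),
]


def support(pattern: Iterable[str]) -> int:
    """Count the transactions that contain every item of the pattern."""
    items = frozenset(pattern)
    return sum(1 for t in transactions if items <= t)


def all_confidence(pattern: Iterable[str]) -> float:
    """Support of the pattern divided by its largest single-item support."""
    items = list(pattern)
    return support(items) / max(support([i]) for i in items)


print(support(["a", "b"]))
print(all_confidence(["a", "b"]))
print(all_confidence(["a", "d"]))
```

A pattern such as (a, d), whose items have very different supports, gets a low all-confidence even though it occurs in the data, which is why the measure is used to filter out weakly correlated patterns.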

Reference:

Lee, Y.K., Kim, W.Y., Cai, Y.D., Han, J. (2003). CoMine: efficient mining of correlated patterns. In ICDM (pp. 581–584).

Parameters:
  • iFile (str or DataFrame) – Name of the input file, or a DataFrame, to mine for the complete set of correlated patterns.

  • oFile (str) – Name of the output file in which the complete set of correlated patterns is stored.

  • minSup (int or float or str) – Minimum support, given either as a count or as a proportion of the database size. If minSup is an integer, it is treated as a count.

  • minAllConf (float) – Minimum all-confidence threshold, a value in the range (0, 1).

  • sep (str) – Separator that distinguishes items from one another in a transaction. The default separator is the tab character; users can override it.
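
To make the role of sep concrete, the sketch below writes a small tab-separated file and splits each line back into items. The file contents and the helper name read_transactions are illustrative assumptions, not PAMI's own reader.

```python
import os
import tempfile
from typing import List


def read_transactions(path: str, sep: str = "\t") -> List[List[str]]:
    """Read one transaction per line, splitting items on sep (hypothetical helper)."""
    transactions = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            items = [item.strip() for item in line.strip().split(sep)]
            items = [item for item in items if item]  # drop empty fields
            if items:
                transactions.append(items)
    return transactions


# Write a tiny database: each line is a transaction, items separated by tabs.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("a\tb\tc\n")
    tmp.write("a\tb\n")
    tmp.write("b\tc\td\n")
    sample_path = tmp.name

sample_db = read_transactions(sample_path, sep="\t")
os.remove(sample_path)
print(sample_db)
```

A file that uses another delimiter, such as a comma, would be read the same way by passing sep=','.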

Attributes:
  • memoryUSS (float) – To store the total amount of USS memory consumed by the program.

  • memoryRSS (float) – To store the total amount of RSS memory consumed by the program.

  • startTime (float) – To record the start time of the mining process.

  • endTime (float) – To record the completion time of the mining process.

  • minSup (int) – The user-specified minimum support.

  • minAllConf (float) – The user-specified minimum all-confidence ratio (should be in the range 0 to 1).

  • Database (list) – Stores the transactions of the database as a list.

  • mapSupport (dict) – Maps each item to its frequency.

  • lno (int) – Total number of transactions.

  • tree (class) – Represents the Tree class.

  • itemSetCount (int) – Total number of patterns.

  • finalPatterns (dict) – Stores the discovered patterns.

  • itemSetBuffer (list) – Stores the items during mining.

  • maxPatternLength (int) – Constraint on the pattern length.

Execution methods

Terminal command

Format:

(.venv) $ python3 CoMine.py <inputFile> <outputFile> <minSup> <minAllConf> <sep>

Example Usage:

(.venv) $ python3 CoMine.py sampleTDB.txt output.txt 0.25 0.2

Note

minSup can be specified in support count or a value between 0 and 1.
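
The count-versus-proportion rule can be sketched as follows. The function resolve_min_sup is a hypothetical helper written for this page, and its handling of string input (a decimal point means a proportion) is an assumption rather than documented behaviour.

```python
from typing import Union


def resolve_min_sup(min_sup: Union[int, float, str], db_size: int) -> float:
    """Turn a user-supplied minSup into an absolute support count (hypothetical)."""
    if isinstance(min_sup, str):
        # Assumption: a string containing a decimal point denotes a proportion.
        min_sup = float(min_sup) if "." in min_sup else int(min_sup)
    if isinstance(min_sup, int):
        return float(min_sup)  # already a count
    return min_sup * db_size  # proportion of the database size


print(resolve_min_sup(3, 8))      # integer: used as a count
print(resolve_min_sup(0.25, 8))   # float: 25% of 8 transactions
print(resolve_min_sup("0.5", 8))  # string with a decimal point: proportion
```

Command-line arguments arrive as strings, which is why the sketch includes a string branch at all.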

Calling from a Python program

from PAMI.correlatedPattern.basic import CoMine as alg

iFile = 'sampleTDB.txt'

oFile = 'output.txt'  # file in which the patterns are saved

minSup = 0.25  # can be specified between 0 and 1

minAllConf = 0.2  # can be specified between 0 and 1

sep = '\t'  # item separator used in the input file

obj = alg.CoMine(iFile, minSup, minAllConf, sep)

obj.mine()

patterns = obj.getPatterns()

print("Total number of Patterns:", len(patterns))

obj.save(oFile)

df = obj.getPatternsAsDataFrame()

memUSS = obj.getMemoryUSS()

print("Total Memory in USS:", memUSS)

memRSS = obj.getMemoryRSS()

print("Total Memory in RSS:", memRSS)

run = obj.getRuntime()

print("Total ExecutionTime in seconds:", run)

Credits

The complete program was written by B.Sai Chitra and revised by Tarun Sreepada under the supervision of Professor Rage Uday Kiran.

getMemoryRSS() → float

Returns the total amount of RSS memory consumed by the mining process.

Returns:

RSS memory consumed by the mining process

Return type:

float

getMemoryUSS() → float

Returns the total amount of USS memory consumed by the mining process.

Returns:

USS memory consumed by the mining process

Return type:

float

getPatterns() → Dict[Tuple[int], List[int | float]]

Returns the set of correlated patterns after the mining process has completed.

Returns:

the correlated patterns

Return type:

dict
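
A short sketch of working with a dictionary of the shape given in the signature. The sample entries are invented (with string items for readability), and reading each value as [support, all-confidence] is an assumption made for illustration; check the actual output before relying on that order.

```python
from typing import Dict, List, Tuple, Union

# Invented sample shaped like the return annotation of getPatterns().
patterns: Dict[Tuple[str, ...], List[Union[int, float]]] = {
    ("a",): [4, 1.0],
    ("a", "b"): [3, 0.75],
    ("b", "c"): [3, 0.75],
    ("a", "d"): [1, 0.25],
}


def top_patterns(found, min_length: int = 2, limit: int = 10):
    """Return patterns with at least min_length items, highest support first."""
    selected = [(p, v) for p, v in found.items() if len(p) >= min_length]
    # Sort by support (assumed to be the first value), then by pattern
    # so that ties come out in a stable order.
    selected.sort(key=lambda pair: (-pair[1][0], pair[0]))
    return selected[:limit]


for pattern, (sup, all_conf) in top_patterns(patterns):
    print(pattern, sup, all_conf)
```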

getPatternsAsDataFrame() → DataFrame

Returns the final correlated patterns as a DataFrame.

Returns:

the correlated patterns in a DataFrame

Return type:

pd.DataFrame

getRuntime() → float

Returns the total runtime of the mining process.

Returns:

total runtime taken by the mining process

Return type:

float
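
The startTime and endTime attributes listed above suggest that the runtime is the difference between two timestamps. A minimal sketch of that pattern, using a hypothetical Timer class rather than PAMI code:

```python
import time


class Timer:
    """Records wall-clock start and end times around a block of work (hypothetical)."""

    def __init__(self) -> None:
        self.startTime = 0.0
        self.endTime = 0.0

    def run(self, work) -> None:
        self.startTime = time.time()
        work()
        self.endTime = time.time()

    def getRuntime(self) -> float:
        # Elapsed seconds between the two recorded timestamps.
        return self.endTime - self.startTime


timer = Timer()
timer.run(lambda: sum(range(100_000)))
print("Elapsed seconds:", timer.getRuntime())
```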

mine() → None

Main method that starts the mining process.

printResults() → None

Prints the results after the mining process has completed.

Returns:

None

recursive(item, nodes, root)

Recursively build the tree structure for itemsets and find patterns that meet the minimum support and all-confidence thresholds.

Parameters:
  • item (Any) – The current item being processed.

  • nodes (list of _Node) – The list of nodes to be processed.

  • root (_Node) – The root node of the current tree.

Returns:

None
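
For intuition about what the recursion produces, here is a simplified depth-first miner that extends a prefix one item at a time and keeps the patterns meeting both thresholds. It scans the transactions directly instead of building an FP-tree, so it is a stand-in technique and not the CoMine implementation; the data and the function name mine_correlated are assumptions.

```python
from typing import Dict, FrozenSet, List, Tuple


def mine_correlated(
    transactions: List[FrozenSet[str]], min_sup: int, min_all_conf: float
) -> Dict[Tuple[str, ...], Tuple[int, float]]:
    """Depth-first search for patterns meeting both thresholds (simplified sketch)."""
    item_support: Dict[str, int] = {}
    for t in transactions:
        for item in t:
            item_support[item] = item_support.get(item, 0) + 1
    # Only items that are frequent on their own can appear in a result.
    items = sorted(i for i, s in item_support.items() if s >= min_sup)
    results: Dict[Tuple[str, ...], Tuple[int, float]] = {}

    def recurse(
        prefix: Tuple[str, ...], covering: List[FrozenSet[str]], start: int
    ) -> None:
        for idx in range(start, len(items)):
            item = items[idx]
            # Transactions that contain the prefix and the new item.
            subset = [t for t in covering if item in t]
            sup = len(subset)
            pattern = prefix + (item,)
            all_conf = sup / max(item_support[i] for i in pattern)
            # Both measures are anti-monotone, so a failing pattern is not extended.
            if sup >= min_sup and all_conf >= min_all_conf:
                results[pattern] = (sup, all_conf)
                recurse(pattern, subset, idx + 1)

    recurse((), transactions, 0)
    return results


db = [
    frozenset(t)
    for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"])
]
found = mine_correlated(db, min_sup=2, min_all_conf=0.7)
for pattern, (sup, all_conf) in sorted(found.items()):
    print(pattern, sup, round(all_conf, 2))
```

In this toy run the pattern (a, b, c) meets the support threshold but is dropped because its all-confidence falls below 0.7.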

save(outFile) → None

Saves the complete set of correlated patterns to an output file.

Parameters:

outFile (file) – Name of the output file

Returns:

None

startMine() → None

Main method that starts the mining process.

PAMI.correlatedPattern.basic.CoMinePlus module

class PAMI.correlatedPattern.basic.CoMinePlus.CoMinePlus(iFile: str | DataFrame, minSup: int | float | str, minAllConf: float, sep: str = '\t')

Bases: _correlatedPatterns

About this algorithm

Description:

CoMinePlus is one of the fundamental algorithms for discovering correlated patterns in a transactional database. It is based on the traditional FP-Growth algorithm and uses a depth-first search technique to find all correlated patterns in the database.
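
Since the description refers to FP-Growth, a brief sketch of the prefix tree that FP-Growth-style miners build may help. Items in each transaction are ordered by descending support and inserted along a shared path. The Node class and build_tree function are illustrative and are not PAMI's internal node or tree classes.

```python
from collections import Counter
from typing import Dict, List, Optional


class Node:
    """One FP-tree node: an item, its count, and its children (illustrative)."""

    def __init__(self, item: Optional[str], parent: Optional["Node"] = None) -> None:
        self.item = item
        self.count = 0
        self.parent = parent
        self.children: Dict[str, "Node"] = {}


def build_tree(transactions: List[List[str]], min_sup: int) -> Node:
    """Insert each transaction's frequent items, most frequent first, into a prefix tree."""
    support = Counter(item for t in transactions for item in set(t))
    root = Node(None)
    for t in transactions:
        frequent = [i for i in set(t) if support[i] >= min_sup]
        # Descending support, ties broken alphabetically, so shared prefixes merge.
        frequent.sort(key=lambda i: (-support[i], i))
        node = root
        for item in frequent:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


tree = build_tree([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c", "d"]], min_sup=2)
print({item: child.count for item, child in tree.children.items()})
```

Three of the four transactions share the path that starts at item a, so the tree stores that prefix once with a count of 3; the infrequent item d is never inserted.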

Reference:

Lee, Y.K., Kim, W.Y., Cai, Y.D., Han, J. (2003). CoMine: efficient mining of correlated patterns. In ICDM (pp. 581–584).

Parameters:
  • iFile (str or DataFrame) – Name of the input file, or a DataFrame, to mine for the complete set of correlated patterns.

  • oFile (str) – Name of the output file in which the complete set of correlated patterns is stored.

  • minSup (int or float or str) – Minimum support, given either as a count or as a proportion of the database size. If minSup is an integer, it is treated as a count.

  • minAllConf (float) – Minimum all-confidence threshold, a value in the range (0, 1).

  • sep (str) – Separator that distinguishes items from one another in a transaction. The default separator is the tab character; users can override it.

Attributes:
  • memoryUSS (float) – To store the total amount of USS memory consumed by the program.

  • memoryRSS (float) – To store the total amount of RSS memory consumed by the program.

  • startTime (float) – To record the start time of the mining process.

  • endTime (float) – To record the completion time of the mining process.

  • minSup (int) – The user-specified minimum support.

  • minAllConf (float) – The user-specified minimum all-confidence ratio (should be in the range 0 to 1).

  • Database (list) – Stores the transactions of the database as a list.

  • mapSupport (dict) – Maps each item to its frequency.

  • lno (int) – Total number of transactions.

  • tree (class) – Represents the Tree class.

  • itemSetCount (int) – Total number of patterns.

  • finalPatterns (dict) – Stores the discovered patterns.

  • itemSetBuffer (list) – Stores the items during mining.

  • maxPatternLength (int) – Constraint on the pattern length.

Execution methods

Terminal command

Format:

(.venv) $ python3 CoMinePlus.py <inputFile> <outputFile> <minSup> <minAllConf> <sep>

Example Usage:

(.venv) $ python3 CoMinePlus.py sampleTDB.txt output.txt 0.25 0.2

Note

minSup can be specified in support count or a value between 0 and 1.

Calling from a Python program

from PAMI.correlatedPattern.basic import CoMinePlus as alg

iFile = 'sampleTDB.txt'

oFile = 'output.txt'  # file in which the patterns are saved

minSup = 0.25  # can be specified between 0 and 1

minAllConf = 0.2  # can be specified between 0 and 1

sep = '\t'  # item separator used in the input file

obj = alg.CoMinePlus(iFile, minSup, minAllConf, sep)

obj.mine()

patterns = obj.getPatterns()

print("Total number of Patterns:", len(patterns))

obj.save(oFile)

df = obj.getPatternsAsDataFrame()

memUSS = obj.getMemoryUSS()

print("Total Memory in USS:", memUSS)

memRSS = obj.getMemoryRSS()

print("Total Memory in RSS:", memRSS)

run = obj.getRuntime()

print("Total ExecutionTime in seconds:", run)

Credits

The complete program was written by B.Sai Chitra and revised by Tarun Sreepada under the supervision of Professor Rage Uday Kiran.

getMemoryRSS() → float

Returns the total amount of RSS memory consumed by the mining process.

Returns:

RSS memory consumed by the mining process

Return type:

float

getMemoryUSS() → float

Returns the total amount of USS memory consumed by the mining process.

Returns:

USS memory consumed by the mining process

Return type:

float

getPatterns() → Dict[Tuple[int], List[int | float]]

Returns the set of correlated patterns after the mining process has completed.

Returns:

the correlated patterns

Return type:

dict

getPatternsAsDataFrame() → DataFrame

Returns the final correlated patterns as a DataFrame.

Returns:

the correlated patterns in a DataFrame

Return type:

pd.DataFrame

getRuntime() → float

Returns the total runtime of the mining process.

Returns:

total runtime taken by the mining process

Return type:

float

mine() → None

Main method that starts the mining process.

printResults() → None

Prints the results after the mining process has completed.

Returns:

None

recursive(item, nodes, root)

Recursively build the tree structure for itemsets and find patterns that meet the minimum support and all-confidence thresholds.

Parameters:
  • item (Any) – The current item being processed.

  • nodes (list of _Node) – The list of nodes to be processed.

  • root (_Node) – The root node of the current tree.

Returns:

None

save(outFile) → None

Saves the complete set of correlated patterns to an output file.

Parameters:

outFile (file) – Name of the output file

Returns:

None

startMine() → None

Starts the mining process.

PAMI.correlatedPattern.basic.abstract module

Module contents