PAMI.extras.syntheticDataGenerator package

Submodules

PAMI.extras.syntheticDataGenerator.TemporalDatabase module

class PAMI.extras.syntheticDataGenerator.TemporalDatabase.TemporalDatabase(databaseSize: int, avgItemsPerTransaction: int, numItems: int, sep: str = '\t', occurrenceProbabilityOfSameTimestamp: float = 0.1, occurrenceProbabilityToSkipSubsequentTimestamp: float = 0.1)[source]

Bases: object

create() None[source]

Create the temporal database or DataFrame based on the specified type of file.
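The two probability parameters in the constructor suggest how timestamps might be assigned. The following is a minimal sketch, not PAMI's implementation. `assign_timestamps`, `p_same`, `p_skip`, and `rng` are hypothetical names, and the sketch assumes that one coin flip decides whether a transaction reuses the previous timestamp and a second decides whether one timestamp is skipped.

```python
import random


def assign_timestamps(database_size, p_same, p_skip, rng=None):
    """Return one non-decreasing timestamp per transaction (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    timestamps = []
    current = 1
    for index in range(database_size):
        if index > 0:
            # Assumption: with probability p_same the previous timestamp is reused.
            if rng.random() >= p_same:
                current += 1
                # Assumption: with probability p_skip one extra timestamp stays empty.
                if rng.random() < p_skip:
                    current += 1
        timestamps.append(current)
    return timestamps


stamps = assign_timestamps(10, p_same=0.1, p_skip=0.1, rng=random.Random(0))
print(stamps)
```

With `p_same=1.0` every transaction shares timestamp 1, and with both probabilities at 0 the timestamps are simply 1, 2, 3, and so on.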

getMemoryRSS() int[source]
getMemoryUSS() int[source]
getRuntime() float[source]

Returns the runtime of the algorithm in seconds.

getTransactions() None[source]

Convert the database to a DataFrame.

static performCoinFlip(probability: float) bool[source]

Perform a coin flip with the given probability.

Parameters:

probability – Probability of the coin landing heads (i.e., the event occurring).

Returns:

True if the coin lands heads, False otherwise.
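A coin flip with a given probability is commonly written as a comparison against a uniform random draw. The sketch below illustrates that idea only. `perform_coin_flip` is a hypothetical stand-in, and the range check is an added assumption rather than documented behaviour.

```python
import random


def perform_coin_flip(probability):
    """Return True with the given probability (hypothetical stand-in)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # random.random() is uniform on [0.0, 1.0), so the comparison is True
    # with chance `probability`.
    return random.random() < probability


heads = sum(perform_coin_flip(0.1) for _ in range(10_000))
print(f"heads in 10000 flips at p=0.1: {heads}")
```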

save(outputFile: str = None) None[source]

Save the temporal database to the specified output file.

tuning(array, sumRes) list[source]

Tune the array to ensure that the sum of the values equals sumRes.
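One way to make a list of integer transaction lengths hit an exact total is to nudge random entries up or down by one until the sum matches. The sketch below shows that approach under stated assumptions. `tune` is a hypothetical helper, the values are assumed to be non-negative integers, and the rule that no entry drops below 1 is an illustrative choice, not documented behaviour.

```python
import random


def tune(values, target_sum, rng=None):
    """Adjust integer values by +/-1 at random positions until they sum to target_sum."""
    if rng is None:
        rng = random.Random()
    tuned = list(values)
    if not tuned:
        raise ValueError("values must not be empty")
    if target_sum < len(tuned):
        raise ValueError("target_sum is too small to keep every entry >= 1")
    while sum(tuned) != target_sum:
        index = rng.randrange(len(tuned))
        if sum(tuned) < target_sum:
            tuned[index] += 1
        elif tuned[index] > 1:
            # Assumption: never shrink an entry below 1, so no transaction is empty.
            tuned[index] -= 1
    return tuned


print(tune([3, 3, 3], 12, random.Random(1)))
```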

PAMI.extras.syntheticDataGenerator.TransactionalDatabase module

class PAMI.extras.syntheticDataGenerator.TransactionalDatabase.TransactionalDatabase(databaseSize, avgItemsPerTransaction, numItems, sep='\t')[source]

Bases: object

Description:

TransactionalDatabase is a collection of transactions. It only considers the data in transactions and ignores the metadata.

Attributes:
dataBaseSize: int

Number of Transactions in a database

avgItemsPerTransaction: int

Average number of items per transaction

itemsNo: int

Total number of items

memoryUSS: float

To store the total amount of USS memory consumed by the program

memoryRSS: float

To store the total amount of RSS memory consumed by the program

startTime: float

To record the start time of the mining process

endTime: float

To record the completion time of the mining process

Methods:
create:

Generate the transactional database

save:

Save the transactional database to a user-specified file

getTransactions:

Get the transactional database

getMemoryUSS()

Total amount of USS memory consumed by the mining process will be retrieved from this function

getMemoryRSS()

Total amount of RSS memory consumed by the mining process will be retrieved from this function

getRuntime()

Total amount of runtime taken by the mining process will be retrieved from this function

Methods to execute code on terminal

Format:

(.venv) $ python3 TransactionalDatabase.py <dataBaseSize> <avgItemsPerTransaction> <itemsNo>

Example Usage:

(.venv) $ python3 TransactionalDatabase.py 50.0 10.0 100

Importing this algorithm into a python program

from PAMI.extras.syntheticDataGenerator import TransactionalDatabase as db

obj = db.TransactionalDatabase(10, 5, 10)

obj.create()

obj.save('db.txt')

print(obj.getTransactions())

create()[source]
getMemoryRSS() float[source]
getMemoryUSS() float[source]
getRuntime() float[source]
getTransactions()[source]
save(filename)[source]

PAMI.extras.syntheticDataGenerator.createSyntheticGeoreferentialTemporal module

PAMI.extras.syntheticDataGenerator.createSyntheticGeoreferentialTransactions module

class PAMI.extras.syntheticDataGenerator.createSyntheticGeoreferentialTransactions.createSyntheticGeoreferentialTransaction(transactions, items, avgTransaction)[source]

Bases: object

This class creates a synthetic geo-referential transactional database.

Attributes:
totalTransactions: int

Number of transactions

items: int

Number of items

avgTransactionLength: str

Average length of a transaction

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createGeoreferentialTransactionalDatabase(outputFile)

Create the geo-referential transactional database and store it in outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createGeoreferentialTransactionalDatabase(outputFile)[source]

Create the geo-referential transactional database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.createSyntheticGeoreferentialUncertainTransaction module

class PAMI.extras.syntheticDataGenerator.createSyntheticGeoreferentialUncertainTransaction.createSyntheticGeoreferentialUncertainTransaction(transactions: int, items: int, avgTransaction: int)[source]

Bases: object

This class creates a synthetic geo-referential uncertain transactional database.

Attributes:
totalTransactions: int

Number of transactions

noOfItems: int

Number of items

avgTransactionLength: int

Average length of a transaction

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createGeoreferentialUncertainTransactionalDatabase(outputFile)

Create the geo-referential uncertain transactional database and store it in outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createGeoreferentialUncertainTransactionalDatabase(outputFile: str) None[source]

Create the geo-referential uncertain transactional database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.createSyntheticTemporal module

class PAMI.extras.syntheticDataGenerator.createSyntheticTemporal.createSyntheticTemporal(transactions: int, items: int, avgTransaction: int)[source]

Bases: object

This class creates a synthetic temporal database.

Attributes:
totalTransactions: int

Number of transactions

noOfItems: int

Number of items

avgTransactionLength: str

Average length of a transaction

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createTemporalDatabase(outputFile)

Create temporal database from DataFrame and store into outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createTemporalDatabase(outputFile: str) None[source]

Create the temporal database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.createSyntheticTransactions module

class PAMI.extras.syntheticDataGenerator.createSyntheticTransactions.createSyntheticTransaction(totalTransactions: int, items: int, avgTransactionLength: int)[source]

Bases: object

This class creates a synthetic transactional database.

Attributes:
totalTransactions: int

Number of transactions

noOfItems: int

Number of items

avgTransactionLength: int

Average length of a transaction

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createTransactionalDatabase(outputFile)

Create transactional database and store into outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createTransactionalDatabase(outputFile: str) None[source]

Create the transactional database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.createSyntheticUncertainTemporal module

class PAMI.extras.syntheticDataGenerator.createSyntheticUncertainTemporal.createSyntheticUncertainTemporal(totalTransactions: int, items: int, avgTransaction: int)[source]

Bases: object

This class creates a synthetic uncertain temporal database.

Attributes:
totalTransactions: int

Total number of transactions

noOfItems: int

Number of items

avgTransactionLength: int

Average length of a transaction

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createUncertainTemporalDatabase(outputFile)

Create temporal database from DataFrame and store into outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createUncertainTemporalDatabase(outputFile: str) None[source]

Create the uncertain temporal database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.createSyntheticUncertainTransactions module

class PAMI.extras.syntheticDataGenerator.createSyntheticUncertainTransactions.createSyntheticUncertainTransaction(transactions: int, items: int, avgTransaction: int)[source]

Bases: object

This class creates a synthetic uncertain transactional database.

Attributes:
totalTransactions: int

Number of transactions

noOfItems: int

Number of items

avgTransactionLength: str

Average length of a transaction

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createUncertainTransactionalDatabase(outputFile)

Create uncertain transactional database and store into outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createUncertainTransactionalDatabase(outputFile: str) None[source]

Create the uncertain transactional database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.createSyntheticUtility module

class PAMI.extras.syntheticDataGenerator.createSyntheticUtility.createSyntheticUtility(transactions: int, items: int, maxUtilRange: int, avgTransaction: int)[source]

Bases: object

This class creates a synthetic utility database.

Attributes:
totalTransactions: int

Number of transactions

noOfItems: int

Number of items

maxUtilRange: int

Maximum utility range

avgTransactionLength: int

The length of average transaction

outputFile: str

Name of the output file.

Methods:
createUtilityDatabase(outputFile)

Create utility database from DataFrame and store into outputFile

Credits:

The complete program was written by P.Likhitha under the supervision of Professor Rage Uday Kiran.

createUtilityDatabase(outputFile: str) None[source]

Create the utility database and return the output file name.

Parameters:

outputFile (str) – file name or path to store database

Returns:

outputFile name

PAMI.extras.syntheticDataGenerator.fuzzyDatabase module

PAMI.extras.syntheticDataGenerator.generateTemporal module

class PAMI.extras.syntheticDataGenerator.generateTemporal.generateTemporal(numOfTransactions: int, avgLenOfTransactions: float, numItems: int, outputFile: str, percentage: int = 50, sep: str = '\t', typeOfFile: str = 'Database')[source]

Bases: object

Description:

generateTemporal creates a temporal database and outputs either a database file or a DataFrame, depending on typeOfFile

Attributes:
numOfTransactions: int

Number of transactions

avgLenOfTransactions: float

Average length of transactions

numItems: int

Number of items

outputFile: str

Output file name

percentage: int

Percentage used by the coin toss for the TID of the temporal database

sep: str

Separator for the database output file

typeOfFile: str

Specify database or dataframe to get the corresponding output

Methods:
getFileName():

returns filename

createTemporalFile():

creates temporal database file or dataframe

getDatabaseAsDataFrame:

returns dataframe

performCoinFlip():

Perform a coin flip with the given probability

tuning():

Tune the arrayLength to match avgLenOfTransactions

Importing this algorithm into a python program

from PAMI.extras.syntheticDataGenerator import generateTemporal as db

numOfTransactions = 100
numItems = 15
avgTransactionLength = 6
outFileName = 'temporal_ot.txt'
sep = '     '
percent = 75
frameOrBase = "dataframe" # if you want to get dataframe as output
frameOrBase = "database" # if you want to get database/csv/file as output

# Class and method names follow the signatures documented for this module
temporalDB = db.generateTemporal(numOfTransactions, avgTransactionLength, numItems, outFileName, percent, sep, frameOrBase)
temporalDB.createTemporalFile()
print(temporalDB.getTransactions())

createTemporalFile() None[source]

Create the temporal database or DataFrame, depending on the input.

Returns:

None

generateArray(nums, maxItems, sumRes) ndarray[source]

Generate a random array of nums values, each at most maxItems, whose sum is sumRes

Parameters:
  • nums (int) – number of values

  • maxItems (int) – maximum value

  • sumRes (int) – Resultant sum

Returns:

random array

Return type:

numpy array

getFileName() str[source]

Return the file name.

Returns:

filename

Return type:

str

getTransactions() DataFrame[source]

Return the database as a DataFrame.

Returns:

dataframe

Return type:

pd.DataFrame

static performCoinFlip(probability: float) bool[source]

Perform a coin flip with the given probability.

Parameters:

probability (float) – probability to perform coin flip

Returns:

True if coin flip is performed, False otherwise

Return type:

bool

save(sep, filename) None[source]

Save the transactional database to a file

Parameters:
  • sep (str) – Separator

  • filename (str) – name of the file

Returns:

None
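To make the role of `sep` and `filename` concrete, here is a minimal sketch of writing a temporal database as delimited text. It is not the library's code. `save_temporal` and `rows` are hypothetical names, and the layout (timestamp first, then the items of the transaction, one transaction per line) is an assumption.

```python
import os
import tempfile


def save_temporal(rows, filename, sep="\t"):
    """Write (timestamp, items) pairs, one transaction per line (hypothetical helper)."""
    with open(filename, "w", encoding="utf-8") as handle:
        for timestamp, items in rows:
            # Assumed layout: timestamp, then every item, joined by the separator.
            fields = [str(timestamp)] + [str(item) for item in items]
            handle.write(sep.join(fields) + "\n")


rows = [(1, [3, 7, 9]), (1, [2, 7]), (3, [4])]
path = os.path.join(tempfile.mkdtemp(), "temporal_db.txt")
save_temporal(rows, path)
with open(path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
print(lines)
```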

static tuning(array, sumRes) ndarray[source]

Tune the array so that the sum of the values is equal to sumRes

Parameters:
  • array (numpy.ndarray) – list of values

  • sumRes (int) – the sum of the values in the array to be tuned

Returns:

the tuned array, whose values sum to sumRes

Return type:

numpy.ndarray

PAMI.extras.syntheticDataGenerator.generateTransactional module

class PAMI.extras.syntheticDataGenerator.generateTransactional.generateTransactional(numLines, avgItemsPerLine, numItems)[source]

Bases: object

Description:

Generate a transactional database with the given number of lines, average number of items per line, and total number of items

Attributes:

numLines: int
  • number of lines

avgItemsPerLine: float
  • average number of items per line

numItems: int
  • total number of items

Methods:
create:

Generate the transactional database

save:

Save the transactional database to a file

getTransactions:

Get the transactional database

create() None[source]

Generate the transactional database.

Returns:

None

generateArray(nums, avg, maxItems) ndarray[source]

Generate a random array of nums values, each at most maxItems, whose average is avg

Parameters:
  • nums (int) – number of values

  • avg (int) – average value

  • maxItems (int) – maximum value

Returns:

random array

Return type:

numpy.ndarray
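The sketch below shows one possible way to draw `nums` transaction lengths that respect an upper bound and a requested average. It is an illustration under assumptions, not the library's algorithm. `generate_lengths` is a hypothetical name, it returns a plain list instead of a numpy array, and the lower bound of 1 per entry is an added assumption.

```python
import random


def generate_lengths(nums, avg, max_items, rng=None):
    """Draw nums integers in [1, max_items] whose mean is avg (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    if not 1 <= avg <= max_items:
        raise ValueError("avg must lie between 1 and max_items")
    # The mean is avg exactly when the values sum to nums * avg (rounded to an integer).
    target = round(nums * avg)
    lengths = [rng.randint(1, max_items) for _ in range(nums)]
    while sum(lengths) != target:
        index = rng.randrange(nums)
        if sum(lengths) < target and lengths[index] < max_items:
            lengths[index] += 1
        elif sum(lengths) > target and lengths[index] > 1:
            lengths[index] -= 1
    return lengths


lengths = generate_lengths(20, 5, 10, random.Random(3))
print(lengths, sum(lengths) / len(lengths))
```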

getTransactions() DataFrame[source]

Get the transactional database

Returns:

the transactional database

Return type:

pd.DataFrame

save(sep, filename) None[source]

Save the transactional database to a file

Parameters:
  • sep (str) – separator

  • filename (str) – name of the file

Returns:

None

static tuning(array, sumRes) ndarray[source]

Tune the array so that the sum of the values is equal to sumRes

Parameters:
  • array (numpy.ndarray) – list of values

  • sumRes (int) – the sum of the values in the array to be tuned

Returns:

the tuned array, whose values sum to sumRes

Return type:

numpy.ndarray

PAMI.extras.syntheticDataGenerator.generateUncertainTemporal module

class PAMI.extras.syntheticDataGenerator.generateUncertainTemporal.generateUncertainTemporal(transactionSize: int, numOfItems: int, avgTransactionLength: int, significant=2)[source]

Bases: object

generate() None[source]
save(outputFile: str, sep='\t') None[source]

PAMI.extras.syntheticDataGenerator.generateUncertainTransactional module

class PAMI.extras.syntheticDataGenerator.generateUncertainTransactional.generateUncertainTransactional(transactionSize: int, numOfItems: int, avgTransactionLength: int, significant=2)[source]

Bases: object

generate() None[source]
save(outputFile: str, sep='\t') None[source]

PAMI.extras.syntheticDataGenerator.generateUtilityTemporal module

class PAMI.extras.syntheticDataGenerator.generateUtilityTemporal.generateUtilityTemporal(transactionSize: int, numOfItems: int, avgTransactionLength: int, minUtilityValue: int, maxUtilityValue: int, minNumOfTimesAnItem: int, maxNumOfTimesAnItem: int)[source]

Bases: object

generate() None[source]
save(outputFile: str, sep='\t', utilityType='utility') None[source]

PAMI.extras.syntheticDataGenerator.generateUtilityTransactional module

class PAMI.extras.syntheticDataGenerator.generateUtilityTransactional.generateUtilityTransactional(transactionSize: int, numOfItems: int, avgTransactionLength: int, minUtilityValue: int, maxUtilityValue: int, minNumOfTimesAnItem: int, maxNumOfTimesAnItem: int)[source]

Bases: object

generate() None[source]
save(outputFile: str, sep='\t', utilityType='utility') None[source]

PAMI.extras.syntheticDataGenerator.georeferencedTemporalDatabase module

PAMI.extras.syntheticDataGenerator.georeferencedTransactionalDatabase module

PAMI.extras.syntheticDataGenerator.syntheticUtilityDatabase module

class PAMI.extras.syntheticDataGenerator.syntheticUtilityDatabase.syntheticUtilityDatabase(totalTransactions: int, numOfItems: int, maxUtilRange: int, avgTransactionLength: int)[source]

Bases: object

This class creates a synthetic utility database.

totalTransactions: int

Number of transactions.

numOfItems: int

Number of items.

maxUtilRange: int

Maximum utility range.

avgTransactionLength: int

Average length of a transaction.

_memoryUSS: float

To store the total amount of USS memory consumed by the program

_memoryRSS: float

To store the total amount of RSS memory consumed by the program

_startTime: float

To record the start time of the mining process

_endTime: float

To record the completion time of the mining process

__init__(totalTransactions, numOfItems, maxUtilRange, avgTransactionLength)[source]

Constructor to initialize the database parameters.

createSyntheticUtilityDatabase(outputFile)[source]

Create utility database and store it in the specified output file.

createRandomNumbers(n, targetSum)[source]

Generate a list of random numbers with a specified target sum.

save(outputFile)[source]

Save the generated utility database to a CSV file.

getMemoryUSS()[source]

Total amount of USS memory consumed by the mining process will be retrieved from this function

getMemoryRSS()[source]

Total amount of RSS memory consumed by the mining process will be retrieved from this function

getRuntime()[source]

Total amount of runtime taken by the mining process will be retrieved from this function

Credits:

The complete program was written by A.Hemanth sree sai under the supervision of Professor Rage Uday Kiran.

static createRandomNumbers(n: int, targetSum: int) list[float][source]

Generate a list of random numbers with a specified target sum.

Parameters:
  • n (int) – Number of random numbers to generate.

  • targetSum (int) – Target sum for the generated random numbers.

Returns:

List of generated random numbers normalized and multiplied by the target sum.

Return type:

list
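The "normalized and multiplied by the target sum" step described above can be sketched as follows. This is a minimal illustration, not the library's code. `create_random_numbers` and `rng` are hypothetical names, and the equal-share fallback for an all-zero draw is an added assumption.

```python
import random


def create_random_numbers(n, target_sum, rng=None):
    """Return n non-negative floats that add up to target_sum (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    if n <= 0:
        raise ValueError("n must be positive")
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    if total == 0:
        # Assumption: fall back to equal shares in the unlikely all-zero case.
        return [target_sum / n] * n
    # Dividing by the total normalizes the draws to sum to 1;
    # multiplying by target_sum rescales them to the requested total.
    return [value / total * target_sum for value in raw]


numbers = create_random_numbers(5, 100, random.Random(4))
print(numbers, sum(numbers))
```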

createSyntheticUtilityDatabase(outputFile: str) None[source]

Create utility database and store it in the specified output file.

Parameters:

outputFile (str) – File name or path to store the database.

getMemoryRSS() float[source]

Total amount of RSS memory consumed by the mining process will be retrieved from this function

Returns:

returning RSS memory consumed by the mining process

Return type:

float

getMemoryUSS() float[source]

Total amount of USS memory consumed by the mining process will be retrieved from this function

Returns:

returning USS memory consumed by the mining process

Return type:

float

getRuntime() float[source]

Calculating the total amount of runtime taken by the mining process

Returns:

returning total amount of runtime taken by the mining process

Return type:

float

save(outputFile: str) None[source]

Save the generated utility database to a CSV file.

Parameters:

outputFile (str) – File name or path to store the CSV file.

PAMI.extras.syntheticDataGenerator.temporalDatabaseGen module

PAMI.extras.syntheticDataGenerator.utilityDatabase module

Module contents