PAMI.subgraphMining.topK package

Submodules

PAMI.subgraphMining.topK.DFSCode module

class PAMI.subgraphMining.topK.DFSCode.DfsCode[source]

Bases: object

add(ee)[source]

Adds an extended edge to the DFSCode object.

Parameters:

ee (ExtendedEdge) – The extended edge to add.

containEdge(v1, v2)[source]

Checks if an edge between vertices v1 and v2 is present.

Parameters:
  • v1 (int) – The first vertex of the edge.

  • v2 (int) – The second vertex of the edge.

Returns:

True if the edge is present, False otherwise.

Return type:

bool

copy()[source]

Creates a deep copy of the DFSCode object.

getAllVLabels()[source]

Retrieves all vertex labels from the extended edge list.

Returns:

A list of vertex labels.

Return type:

list

getAt(index)[source]

Retrieves the extended edge at the specified index.

Parameters:

index (int) – The index of the extended edge to retrieve.

Returns:

The extended edge at the specified index.

Return type:

ExtendedEdge

getEeList()[source]

Retrieves the extended edge list.

Returns:

A list of ExtendedEdge objects.

Return type:

list

getRightMost()[source]

Retrieves the rightmost vertex.

Returns:

The index of the rightmost vertex.

Return type:

int

getRightMostPath()[source]

Retrieves the rightmost path.

Returns:

A list containing the vertices on the rightmost path.

Return type:

list

isEmpty()[source]

Checks if the DFSCode object is empty.

Returns:

True if the DFSCode object is empty, False otherwise.

Return type:

bool

notPreOfRm(v)[source]

Checks if a given vertex is not the second-to-last element on the rightmost path.

Parameters:

v (int) – The vertex to check.

Returns:

True if the vertex is not the second-to-last element on the rightmost path, False otherwise.

Return type:

bool

onRightMostPath(v)[source]

Checks if a vertex is present on the rightmost path.

Parameters:

v (int) – The vertex to check.

Returns:

True if the vertex is present on the rightmost path, False otherwise.

Return type:

bool

PAMI.subgraphMining.topK.DFSThread module

class PAMI.subgraphMining.topK.DFSThread.DfsThread(graphDb, candidates, minSup, tkgInstance)[source]

Bases: Thread

A thread class for performing DFS-based subgraph mining on a set of candidates.

Parameters:
  • graphDb (GraphDatabase) – The graph database containing the input graphs.

  • candidates (Queue) – A queue containing candidate subgraphs to be mined.

  • minSup (int) – The minimum support threshold for frequent subgraph mining.

  • tkgInstance (TKGInstance) – An instance of the Top-K Graphs (TKG) algorithm for dynamic subgraph mining.

run()[source]

Runs the DFS-based subgraph mining process on the candidate subgraphs.

This method is invoked when the thread is started using the start() method.

Within the mining loop: - Retrieves a candidate subgraph from the candidates queue. - Checks if the candidate’s support meets the minimum support threshold. - If the support is sufficient, invokes the gspanDynamicDFS method of the TKGInstance

to perform dynamic DFS-based subgraph mining with the candidate’s DFS code, the graph database, and the set of graph IDs associated with the candidate.

This method continues to run until the candidates queue is empty or until a candidate with insufficient support is encountered.

PAMI.subgraphMining.topK.abstract module

PAMI.subgraphMining.topK.edge module

class PAMI.subgraphMining.topK.edge.Edge(v1, v2, edgeLabel)[source]

Bases: object

another(v)[source]

Returns the other vertex of the edge given one vertex.

getEdgeLabel()[source]

Retrieves the label of the edge.

PAMI.subgraphMining.topK.extendedEdge module

class PAMI.subgraphMining.topK.extendedEdge.ExtendedEdge(v1, v2, vLabel1, vLabel2, edgeLabel)[source]

Bases: object

getEdgeLabel()[source]

Retrieves the label of the edge.

getV1()[source]

Retrieves the index of the first vertex.

getV2()[source]

Retrieves the index of the second vertex.

getVLabel1()[source]

Retrieves the label of the first vertex.

getVLabel2()[source]

Retrieves the label of the second vertex.

pairSmallerThan(x1, x2, y1, y2)[source]

Determines if one pair of vertices is smaller than another pair.

smallerThan(that)[source]

Checks if this extended edge is smaller than another extended edge.

smallerThanOriginal(that)[source]

Checks if this extended edge is smaller than another extended edge based on original conditions.

PAMI.subgraphMining.topK.frequentSubgraph module

class PAMI.subgraphMining.topK.frequentSubgraph.FrequentSubgraph(dfsCode, setOfGraphsIds, support)[source]

Bases: object

PAMI.subgraphMining.topK.graph module

class PAMI.subgraphMining.topK.graph.Graph(iD, vMap=None, dfsCode=None)[source]

Bases: object

Represents a graph structure composed of vertices and edges.

EMPTY_VERTEX_LIST

An empty list used as a default value for vertex lists.

Type:

list

EMPTY_INTEGER_ARRAY

An empty list used as a default value for integer arrays.

Type:

list

EMPTY_INTEGER_ARRAY = []
EMPTY_VERTEX_LIST = []
findAllWithLabel(targetLabel)[source]

Finds all vertices with a specified label.

getAllNeighbors(v)[source]

Retrieves all neighbors of a vertex.

getAllVertices()[source]

Retrieves all vertices in the graph.

getEdge(v1, v2)[source]

Retrieves the edge between two vertices.

getEdgeCount()[source]

Retrieves the total number of edges in the graph.

getEdgeLabel(v1, v2)[source]

Retrieves the label of an edge between two vertices.

getId()[source]

Retrieves the ID of the graph.

getNonPrecalculatedAllVertices()[source]

Retrieves all vertices that have not been precalculated.

getVLabel(v)[source]

Retrieves the label of a vertex.

isNeighboring(v1, v2)[source]

Checks if two vertices are neighbors.

precalculateLabelsToVertices()[source]

Precalculates and caches a mapping of labels to vertex IDs.

precalculateVertexList()[source]

Precalculates and caches the list of vertices.

precalculateVertexNeighbors()[source]

Precalculates and caches the neighbor vertices for each vertex.

removeInfrequentLabel(label)[source]

Removes vertices with infrequent labels and their associated edges.

PAMI.subgraphMining.topK.sparseTriangularMatrix module

class PAMI.subgraphMining.topK.sparseTriangularMatrix.SparseTriangularMatrix[source]

Bases: object

Represents a sparse triangular matrix structure for storing counts or support values.

matrix

A dictionary representing the sparse matrix structure.

Type:

dict

getSupportForItems(i, j)[source]

Retrieves the support value for items at positions (i, j) in the matrix.

incrementCount(i, j)[source]

Increments the count or support value at the position (i, j) in the matrix.

removeInfrequentEntriesFromMatrix(minsup)[source]

Removes entries from the matrix with support values below a specified minimum support threshold.

setSupport(i, j, support)[source]

Sets the support value for items at positions (i, j) in the matrix.

PAMI.subgraphMining.topK.tkg module

class PAMI.subgraphMining.topK.tkg.TKG(iFile, k, maxNumberOfEdges=inf, outputSingleVertices=True, outputGraphIds=False)[source]

Bases: _TKG

EDGE_COUNT_PRUNING = True
ELIMINATE_INFREQUENT_EDGE_LABELS = True
ELIMINATE_INFREQUENT_VERTEX_PAIRS = True
ELIMINATE_INFREQUENT_VERTICES = True
class Pair(x, y)[source]

Bases: object

findAllOnlyOneVertex(graphDB, outputFrequentVertices)[source]
gSpan(graphDB, outputFrequentVertices)[source]

The main gSpan function to find frequent subgraphs. :param graphDb: The database of graphs to mine. :param outputFrequentVertices: Boolean indicating whether to output single vertices as subgraphs.

getKSubgraphs()[source]

Return the formatted subgraphs as a single string with correct formatting and newlines.

getMemoryRSS()[source]
getMemoryUSS()[source]
getMinSupport()[source]
getQueueSize(queue)[source]
getRuntime()[source]
getSubgraphsList()[source]

Creates a copy of the queue’s contents without emptying the original queue.

get_label(label_char)[source]

Maps a character vertex label to a unique integer. If the label is already mapped, returns the existing integer. Otherwise, assigns a new integer to the label.

gspanDfs(c: DfsCode, graphDB, subgraphId)[source]
gspanDynamicDFS(c, graphDB, graphIds)[source]
isCanonical(c: DfsCode)[source]
mine()[source]

This Python function starts a mining process on a graph database, calculates runtime, pattern count, and memory usage metrics.

readGraphs(path)[source]

Reads graph data from a file and constructs a list of graphs with vertices and edges. Handles character vertex labels by mapping them to unique integers. Edge labels are assumed to be integers.

registerAsCandidate(subgraph)[source]
removeInfrequentVertexPairs(graphDB)[source]
rightMostPathExtensions(c, graphDB, graphIds)[source]
rightMostPathExtensionsFromSingle(c, g)[source]
save(oFile)[source]

Saves the frequent subgraphs to an output file. Converts integer vertex labels back to their original characters.

savePattern(subgraph)[source]
startThreads(graphDB, candidates, minSup)[source]
subgraphIsomorphisms(c, g)[source]

PAMI.subgraphMining.topK.vertex module

class PAMI.subgraphMining.topK.vertex.Vertex(iD, vLabel)[source]

Bases: object

Represents a vertex in a graph.

id

The identifier of the vertex.

Type:

int

vLabel

The label associated with the vertex.

eList

A list of edges connected to the vertex.

Type:

list

addEdge(edge)[source]

Adds an edge to the vertex’s edge list.

getEdgeList()[source]

Retrieves the list of edges connected to the vertex.

getId()[source]

Retrieves the identifier of the vertex.

getLabel()[source]

Retrieves the label of the vertex.

removeEdge(edgeToRemove)[source]

Removes a specified edge from the vertex’s edge list.

Module contents