brain_loop_search.search
This module for now involves 2 loop search algorithms. Because of the great complexity this task can have, such loop searching can only be done in a screening way. Thus, they might not be able to give every possible results, but it's good to make a combination of different strategies and use multiple modailty of data to make up for this.
Shortest Path Methods
One method is based on shortest path that will typically get you a series of loops made up of nodes. The shortest path integrates scores on the path so the smaller the score is the better the loop can be. And the scores are actually addable. These are the basic requirements of using such method.
Shortest paths itself can't be loops, so the strategy tries different ways to assemble the shortest paths into loops. It can be a direction connection plus a reverse shortest path, or both shortest path. By doing so, a new concept is introduced as the axis, which is the key node in the graph that you use them as the source or target in shortest path searching. You can have more than 2 axes, of course.
There are other problems like duplication or knots in a loop. Typically, you wouldn't expect that because it means a more basic loop exists, but between some nodes there are only crossed shortest paths connecting them.
Max flow methods
Given the problems the shortest paths can have, we find that shortest paths tend to throw away much of the information in the network, and we need another form of computation. Flow methods come in handy because they balance between the overall graph weights, while introducing other problems. In a flow, the bigger the weight the better the connection can be. The weight can't be summed along any path, but be summed in a node. This requires a very different kind of data than shortest path.
It returns a new graph of flows. You can see the graph as a new distribution of connection, when you want to look into the overall connection from one node to another, which ways are preferred. The downsides are obvious, since flow computation is intensive, you can only pick specific axes to make the flow, and also a flow is a graph instead of an actual loop. No mass screening this time but look more closely at specific sites.
Is it possible to bring these two methods together? Yes, because the flow results can replace the graph to do shortest paths. This way you may get diverse loops with different flow strategies on a single network.
1""" 2This module for now involves 2 loop search algorithms. Because of the great complexity this task can have, such 3loop searching can only be done in a screening way. Thus, they might not be able to give every possible results, 4but it's good to make a combination of different strategies and use multiple modailty of data to make up for this. 5 6# Shortest Path Methods 7One method is based on shortest path that will typically get you a series of loops made up of nodes. The shortest path 8integrates scores on the path so the smaller the score is the better the loop can be. And the scores are actually 9addable. These are the basic requirements of using such method. 10 11Shortest paths itself can't be loops, so the strategy tries different ways to assemble the shortest paths into loops. 12It can be a direction connection plus a reverse shortest path, or both shortest path. By doing so, a new concept is 13introduced as the axis, which is the key node in the graph that you use them as the source or target in shortest path 14searching. You can have more than 2 axes, of course. 15 16There are other problems like duplication or knots in a loop. Typically, you wouldn't expect that because it means 17a more basic loop exists, but between some nodes there are only crossed shortest paths connecting them. 18 19# Max flow methods 20Given the problems the shortest paths can have, we find that shortest paths tend to throw away much of the information 21in the network, and we need another form of computation. Flow methods come in handy because they balance between the overall 22graph weights, while introducing other problems. In a flow, the bigger the weight the better the connection can be. 23The weight can't be summed along any path, but be summed in a node. This requires a very different kind of data than shortest 24path. 25 26It returns a new graph of flows. You can see the graph as a new distribution of connection, when you want to look into 27the overall connection from one node to another, which ways are preferred. The downsides are obvious, since flow computation 28is intensive, you can only pick specific axes to make the flow, and also a flow is a graph instead of an actual loop. 29No mass screening this time but look more closely at specific sites. 30 31Is it possible to bring these two methods together? Yes, because the flow results can replace the graph to do shortest paths. 32This way you may get diverse loops with different flow strategies on a single network. 33""" 34 35 36import typing 37 38import networkx as nx 39import pandas as pd 40import numpy as np 41import itertools 42from collections import OrderedDict 43from tqdm import tqdm 44 45 46class GraphMaintainer: 47 48 def __init__(self): 49 self.graph = nx.DiGraph() 50 51 def add_subgraph(self, adj_mat: pd.DataFrame, def_val: float = -1): 52 """ 53 Add weighted edges from a pandas dataframe as an adjacent matrix. 54 55 Each row/column name is a vertex that can be mapped to the input ontology. 56 57 The graph is directed and the projection is from row to column. 58 59 This function can be called multiple times for different dataframes if you don't have 60 them in a single dataframe, but loading edges with same indices will overwrite the original one. 61 62 :param adj_mat: a pandas dataframe, with unique rows and columns. 63 :param def_val: the default value for non-connecting edges. 64 """ 65 assert adj_mat.index.is_unique, "rows contain duplication" 66 assert adj_mat.columns.is_unique, "columns contain duplication" 67 for fro, row in adj_mat.iterrows(): 68 active = row[row != def_val] 69 self.graph.add_weighted_edges_from([*zip([fro] * len(active), active.index, active)]) 70 71 def weight_transform(self, in_key='weight', out_key='weight', func=lambda x: 1 / x): 72 """ 73 Transform the edge weights. The transform is based on networkx edge data, which is maintained as a dict. 74 The default key is 'weight', which is the default key when the edge is added to the graph and used for both 75 shortest path and flow algorithms' input. So this function replace the weights by default. 76 77 This is useful because different algorithms such as network flow and shortest paths expect edge weights 78 of different meaning, thus leading to different treatment to 'big' and 'small'. Smaller weights in network flow 79 is equivalent to bigger weights in shortest path. 80 81 :param func: transform function. 82 :param in_key: input attribute key. 83 :param out_key: output attribute key. 84 """ 85 w = nx.get_edge_attributes(self.graph, 'weight') 86 for k, v in w.items(): 87 w[k] = {'weight': func(v)} 88 nx.set_edge_attributes(self.graph, w) 89 90 91class MaxFlowLoopSearch(GraphMaintainer): 92 """ 93 Using networkx directed graph for implementing loop search algorithms based on network flow. 94 95 As this method uses network flow, you should make sure the edge weights are meaningful so that 96 a bigger weight means a strong connection. The flow along a path is throttled by the max capacity, 97 and the whole flow is distributed over the edges and determined by the overall capacity. 98 99 With this class, you can get loops in forms of a residual network. 100 There is no definite loop that can form a list, unlike shortest path, but it maximally considers and 101 retains the information of the whole graph while shortest paths may omit lesser good routes. 102 The graph generated can even be applied the shortest path search, if you find it make sense. You can 103 assign the graph member in `ShortestPathLoopSearch` with the result from this one, because they are identical in form. 104 105 Since network flow considers only one pair of source and sink, applying this on multiple pairs 106 can be time-consuming, so the method isn't designed to search through all pairs in the graph. 107 """ 108 109 def single_flow(self, s, t, rm_negative_and_zero=True): 110 """ 111 compute a max flow, on edge attribute 'weight'. Though it uses the networkx algorithm that stores the result 112 in a new attribute 'flow', it will get this value back to 'weight' for less confusion. 113 114 :param s: source 115 :param t: sink 116 :param rm_negative_and_zero: remove the negative and zero flow on all edges, default as True. 117 :return: a directed graph with flow results as its edges. 118 """ 119 flow = nx.flow.preflow_push(self.graph, s, t, 'weight') 120 it = nx.get_edge_attributes(flow, 'flow').items() 121 out = nx.DiGraph() 122 out.add_weighted_edges_from([k + (v,) for k, v in it if not rm_negative_and_zero or v > 0]) 123 return out 124 125 def cycle_flows(self, axes: typing.Iterable, rm_negative_and_zero=True): 126 """ 127 compute multiple max flow along multiple key vertices that are supposed to be on the loop. 128 129 The flow will be computed from the 1st to 2nd, 2nd to 3rd, ..., last to 1st. 130 131 :param axes: the key vertices. 132 :param rm_negative_and_zero: remove the negative and zero flow on all edges, default as True. 133 :return: a dict of all the flow that is identical to that of the `single_flow`. 134 """ 135 axes = list(axes) 136 axes = axes + axes[:1] 137 out = {} 138 for i, j in itertools.pairwise(axes): 139 out[(i, j)] = self.single_flow(i, j, rm_negative_and_zero) 140 return out 141 142 def magnet_flow(self, s, t): 143 """ 144 With a pair of key vertices s and t in a loop, when edge t -> s exists, it calculates the max flow from s -> t. 145 This is similar to a magnet, where the magnetic field outside corresponds to the max flow and the inner field 146 corresponds to the direct edge. 147 148 Networkx will return a residual network with flow[u][v] == -flow[v][u], but we will remove any negative and zero 149 flows and replace the flow t -> s with just edge t -> s. 150 151 This new graph can be seen as a redistribution of connection intensity, and can be used to do shortest path 152 searches since it's identical in form with the input of the algorithm. 153 154 :param s: the source of the max flow 155 :param t: the sink of the max flow 156 :return: a flow graph with 'weight' as its edge. 157 """ 158 assert self.graph.has_edge(t, s), "the edge from t to s doesn't exist, loop can't be formed" 159 flow = self.single_flow(s, t, rm_negative_and_zero=True) 160 flow.add_edge(t, s, weight=self.graph[t][s]['weight']) 161 return flow 162 163 def merged_cycle_flow(self, axes: typing.Iterable, merge_func=np.sum): 164 """ 165 compute multiple max flows along a cycle of key vertices, and merge the flows into one graph. 166 167 The max flows used are rid of negative and zero flow weights. It's still possible that a single edge may encounter 168 multiple positive flows, meaning some paths can be reused between different key vertices. Then it's necessary 169 to determine how they are merged. 170 171 :param axes: the key vertices. 172 :param merge_func: how to merge multiple weights on a single edge. By default, they are summed. 173 :return: a flow graph with 'weight' as its edge. 174 """ 175 axes = list(axes) 176 sg = nx.DiGraph() 177 for k, v in self.cycle_flows(axes, rm_negative_and_zero=True).items(): 178 sg.add_weighted_edges_from([(u, v, d['weight']) for u, v, d in v.edges(data=True)], k) 179 out = nx.DiGraph() 180 out.add_weighted_edges_from([(u, v, merge_func(list(d.values()))) for u, v, d in sg.edges(data=True)]) 181 return out 182 183 184class ShortestPathLoopSearch(GraphMaintainer): 185 """ 186 Using networkx directed graph for implementing loop search algorithms based on shortest paths. 187 188 As this method uses shortest paths, you should make sure the edge weights are meaningful so that a 189 smaller weight means a stronger connection. The shortest path aggregates the weights along the path, so that 190 summation is also meaningful. 191 192 With this class, you can get loops in forms of exact lists of nodes along the loop. 193 194 The shortest path algorithm uses the 'weight' attribute in the networkx graph. 195 """ 196 197 def __init__(self): 198 super().__init__() 199 self.sssp = None 200 201 def init(self): 202 # bellman ford 203 self.sssp = {} 204 for k, v in tqdm(nx.all_pairs_bellman_ford_path(self.graph), total=self.graph.number_of_nodes()): 205 self.sssp[k] = v 206 207 def chain_screen(self, n_axis=2, top: int = None, must_include: typing.Iterable = None, allow_knots=False, 208 axis_pool: typing.Iterable = None, axis_must_include: typing.Iterable = None, priority=np.sum): 209 """ 210 Using connected single source shortest paths to find loops. You should ensure that in your graph smaller edge 211 weights means stronger connection for this method. 212 213 First, use Bellman Ford to find the shortest paths between all regions. Then loops can be found by 214 pairing any shortest paths like A -> B and B -> A. You can also find loops with more anchors' shortest path like A -> B, 215 B -> C, C -> A, and even more. But with more anchors, the computation will be greatly slowed down, and number of 216 discovery will be greatly expanded. 217 218 When the search number is very great, you can use `top` to have an early stop. The shortest path are sorted such that a 219 loop with minimal shortest path element will be retrieved first, but it won't guarantee the final loop will be ordered 220 ascending. Besides, you can also set the must included nodes in the loop so that those without are skipped. 221 222 The method attempts to located unique loops by the axes, i.e. one set of axes corresponding to one loop. 223 It's likely that multiple axes can be connected in multiple ways, in which case once finding a shorter one a 224 replacement will occur. The purpose is for balance between the uniqueness and computation. 225 Since the shortest path will also be returned, you can check other connection ways easily. 226 227 This can fail to give a loop between a set of axis, because there may be overlap between their shortest path, thus giving 228 rise to smaller loops, whereas there may exist an exact loop that contain non shortest paths but unique vertices. 229 Therefore, the shortest path from A to B here is better off interpreted as the most preferable path from A to B, 230 which means other paths from A to B are not considered. 231 232 :param n_axis: the number to consider for connecting shortest path into a loop. Larger n_axis will significantly 233 increase the computation 234 :param top: the most prominent results. This can speed up as it searches from the most probable loops and can 235 have an early stop. Default as None to turn off. 236 :param axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. 237 Default as None to turn off. 238 :param must_include: the vertices that must appear in the search result. 239 :param axis_must_include: the vertices that must appear in axes, and must be subset of `axis_pool`. This check 240 is independent of `must_include`. 241 :param allow_knots: whether to allow duplicate nodes (but not duplicated with axes). 242 :param priority: the edge weight aggregation function for shortest path sorting. Default as sum, you can also use, for 243 example, mean to make up for shortest path that travel through many vertices. 244 :return: a dict of axis mapped to loop in the form of loops[(a,b,c)] -> [[a, ..., b], [b, ..., c], [c, ..., a]], 245 where the key (a,b,c) is a sorted tuple of axes, doesn't necessarily mean a -> b -> c -> a, 246 and shortest path in the form of sssp[from_node][to_node] -> [from_node, ..., to_node] 247 """ 248 if must_include is not None: 249 must_include = set(must_include) 250 if axis_must_include is not None: 251 axis_must_include = set(axis_must_include) 252 assert len(axis_must_include) <= n_axis, "number of must included axis is more than specified" 253 if axis_pool is not None: 254 axis_pool = set(axis_pool) 255 assert len(axis_pool) >= n_axis, "size of axis pool is smaller than specified" 256 if axis_must_include is not None: 257 assert axis_must_include.issubset(axis_pool), "must included axis should be subset of axis pool" 258 259 if self.sssp is None: 260 self.init() 261 262 # do an edge sorting beforehand, ascending 263 # vertices not in axis pool (if not None) will be omitted 264 connection = [] 265 for k1, v1 in self.sssp.items(): 266 if axis_pool is not None and k1 not in axis_pool: 267 continue 268 for k2, v2 in v1.items(): 269 if axis_pool is not None and k2 not in axis_pool: 270 continue 271 if k1 == k2: 272 continue 273 w = [] 274 for i, j in itertools.pairwise(v2): 275 w.append(self.graph[i][j]['weight']) 276 connection.append([k1, k2, priority(w)]) 277 connection.sort(key=lambda x: x[2]) 278 sorted_sssp = OrderedDict() 279 # sorted_sssp[k1][k2] -> (length, list of vertices) 280 # and k2 mapping is ordered 281 for i, (k1, k2, w) in enumerate(connection): 282 if k1 not in sorted_sssp: 283 sorted_sssp[k1] = OrderedDict() 284 sorted_sssp[k1][k2] = (w, self.sssp[k1][k2]) 285 286 max_iter = [0 if top is None else top] 287 loops = {} 288 289 def dfs_loop_search(loop: list, visited: set, length: float): 290 # finishing, the last part of the cycle, 291 if len(loop) == n_axis - 1: 292 # check the existence of the last part 293 k1 = loop[-1][-1] 294 k2 = loop[0][0] 295 if k1 not in sorted_sssp or k2 not in sorted_sssp[k1]: 296 yield 297 last_part = sorted_sssp[k1][k2][1] 298 299 # check knot 300 sl = set(last_part[1:-1]) 301 new_visited = visited | sl 302 if not allow_knots and len(new_visited) != len(visited) + len(last_part) - 2 or \ 303 not sl.isdisjoint(i[0] for i in loop): 304 yield 305 loop.append(last_part) 306 length += sorted_sssp[k1][k2][0] 307 axis = tuple(sorted(i[0] for i in loop)) 308 309 if (must_include is None or must_include.issubset(new_visited)) and \ 310 (axis_must_include is None or axis_must_include.issubset(axis)): 311 if axis in loops: 312 # check if smaller than existed 313 if loops[axis]['length'] < length: 314 yield 315 else: 316 max_iter[0] += 1 317 max_iter[0] -= 1 318 loops[axis] = { 319 'loop': loop, 320 'length': length, 321 } 322 # start 323 elif len(loop) == 0: 324 for k1, k2, w in connection: 325 v = sorted_sssp[k1][k2][1] 326 yield dfs_loop_search([v], set(v), w) 327 else: 328 k1 = loop[-1][-1] # last axis 329 for k, (w, c) in sorted_sssp[k1].items(): 330 sc = set(c[1:]) 331 new_visited = visited | sc 332 if len(new_visited) == len(visited) + len(c) - 1 or \ 333 allow_knots and sc.isdisjoint(i[0] for i in loop): 334 yield dfs_loop_search(loop + [c], new_visited, length + w) 335 yield 336 337 gen = dfs_loop_search([], set(), 0) 338 stk = [] 339 while top is None or max_iter[0] > 0: 340 if isinstance(gen, typing.Generator): 341 stk.append(gen) 342 gen = next(gen) 343 else: 344 stk.pop() 345 if not stk: 346 break 347 gen = stk[-1].send(gen) 348 349 return loops 350 351 def pair_complement(self, axis_pool: typing.Iterable = None): 352 """ 353 Given a pair of vertices, the loop is defined as the direct connection between them plus the inverted shortest path. 354 For example, with A and B, the loop can be either edge AB + the shortest path from B to A 355 or edge BA + the shortest path from A to B. This comes in handy when shortest paths from A to B and B to A have 356 intersection, leading to knots in a loop. 357 358 This can also be much faster and doesn't need speedup or sorting. 359 360 :param axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. 361 Default as None to turn off, and it will try every pair of vertices as long as there's a direct edge. 362 :return: a dict of loops, keys are the direct edges and values are the corresponding inverse shortest path. 363 """ 364 if axis_pool is not None: 365 axis_pool = set(axis_pool) 366 if self.sssp is None: 367 self.init() 368 out = {} 369 for fro, to in self.graph.edges: 370 if axis_pool is not None and (fro not in axis_pool or to not in axis_pool) or to == fro: 371 continue 372 if to in self.sssp and fro in self.sssp[to]: 373 out[(fro, to)] = self.sssp[to][fro] 374 return out
47class GraphMaintainer: 48 49 def __init__(self): 50 self.graph = nx.DiGraph() 51 52 def add_subgraph(self, adj_mat: pd.DataFrame, def_val: float = -1): 53 """ 54 Add weighted edges from a pandas dataframe as an adjacent matrix. 55 56 Each row/column name is a vertex that can be mapped to the input ontology. 57 58 The graph is directed and the projection is from row to column. 59 60 This function can be called multiple times for different dataframes if you don't have 61 them in a single dataframe, but loading edges with same indices will overwrite the original one. 62 63 :param adj_mat: a pandas dataframe, with unique rows and columns. 64 :param def_val: the default value for non-connecting edges. 65 """ 66 assert adj_mat.index.is_unique, "rows contain duplication" 67 assert adj_mat.columns.is_unique, "columns contain duplication" 68 for fro, row in adj_mat.iterrows(): 69 active = row[row != def_val] 70 self.graph.add_weighted_edges_from([*zip([fro] * len(active), active.index, active)]) 71 72 def weight_transform(self, in_key='weight', out_key='weight', func=lambda x: 1 / x): 73 """ 74 Transform the edge weights. The transform is based on networkx edge data, which is maintained as a dict. 75 The default key is 'weight', which is the default key when the edge is added to the graph and used for both 76 shortest path and flow algorithms' input. So this function replace the weights by default. 77 78 This is useful because different algorithms such as network flow and shortest paths expect edge weights 79 of different meaning, thus leading to different treatment to 'big' and 'small'. Smaller weights in network flow 80 is equivalent to bigger weights in shortest path. 81 82 :param func: transform function. 83 :param in_key: input attribute key. 84 :param out_key: output attribute key. 85 """ 86 w = nx.get_edge_attributes(self.graph, 'weight') 87 for k, v in w.items(): 88 w[k] = {'weight': func(v)} 89 nx.set_edge_attributes(self.graph, w)
52 def add_subgraph(self, adj_mat: pd.DataFrame, def_val: float = -1): 53 """ 54 Add weighted edges from a pandas dataframe as an adjacent matrix. 55 56 Each row/column name is a vertex that can be mapped to the input ontology. 57 58 The graph is directed and the projection is from row to column. 59 60 This function can be called multiple times for different dataframes if you don't have 61 them in a single dataframe, but loading edges with same indices will overwrite the original one. 62 63 :param adj_mat: a pandas dataframe, with unique rows and columns. 64 :param def_val: the default value for non-connecting edges. 65 """ 66 assert adj_mat.index.is_unique, "rows contain duplication" 67 assert adj_mat.columns.is_unique, "columns contain duplication" 68 for fro, row in adj_mat.iterrows(): 69 active = row[row != def_val] 70 self.graph.add_weighted_edges_from([*zip([fro] * len(active), active.index, active)])
Add weighted edges from a pandas dataframe as an adjacent matrix.
Each row/column name is a vertex that can be mapped to the input ontology.
The graph is directed and the projection is from row to column.
This function can be called multiple times for different dataframes if you don't have them in a single dataframe, but loading edges with same indices will overwrite the original one.
Parameters
- adj_mat: a pandas dataframe, with unique rows and columns.
- def_val: the default value for non-connecting edges.
72 def weight_transform(self, in_key='weight', out_key='weight', func=lambda x: 1 / x): 73 """ 74 Transform the edge weights. The transform is based on networkx edge data, which is maintained as a dict. 75 The default key is 'weight', which is the default key when the edge is added to the graph and used for both 76 shortest path and flow algorithms' input. So this function replace the weights by default. 77 78 This is useful because different algorithms such as network flow and shortest paths expect edge weights 79 of different meaning, thus leading to different treatment to 'big' and 'small'. Smaller weights in network flow 80 is equivalent to bigger weights in shortest path. 81 82 :param func: transform function. 83 :param in_key: input attribute key. 84 :param out_key: output attribute key. 85 """ 86 w = nx.get_edge_attributes(self.graph, 'weight') 87 for k, v in w.items(): 88 w[k] = {'weight': func(v)} 89 nx.set_edge_attributes(self.graph, w)
Transform the edge weights. The transform is based on networkx edge data, which is maintained as a dict. The default key is 'weight', which is the default key when the edge is added to the graph and used for both shortest path and flow algorithms' input. So this function replace the weights by default.
This is useful because different algorithms such as network flow and shortest paths expect edge weights of different meaning, thus leading to different treatment to 'big' and 'small'. Smaller weights in network flow is equivalent to bigger weights in shortest path.
Parameters
- func: transform function.
- in_key: input attribute key.
- out_key: output attribute key.
92class MaxFlowLoopSearch(GraphMaintainer): 93 """ 94 Using networkx directed graph for implementing loop search algorithms based on network flow. 95 96 As this method uses network flow, you should make sure the edge weights are meaningful so that 97 a bigger weight means a strong connection. The flow along a path is throttled by the max capacity, 98 and the whole flow is distributed over the edges and determined by the overall capacity. 99 100 With this class, you can get loops in forms of a residual network. 101 There is no definite loop that can form a list, unlike shortest path, but it maximally considers and 102 retains the information of the whole graph while shortest paths may omit lesser good routes. 103 The graph generated can even be applied the shortest path search, if you find it make sense. You can 104 assign the graph member in `ShortestPathLoopSearch` with the result from this one, because they are identical in form. 105 106 Since network flow considers only one pair of source and sink, applying this on multiple pairs 107 can be time-consuming, so the method isn't designed to search through all pairs in the graph. 108 """ 109 110 def single_flow(self, s, t, rm_negative_and_zero=True): 111 """ 112 compute a max flow, on edge attribute 'weight'. Though it uses the networkx algorithm that stores the result 113 in a new attribute 'flow', it will get this value back to 'weight' for less confusion. 114 115 :param s: source 116 :param t: sink 117 :param rm_negative_and_zero: remove the negative and zero flow on all edges, default as True. 118 :return: a directed graph with flow results as its edges. 119 """ 120 flow = nx.flow.preflow_push(self.graph, s, t, 'weight') 121 it = nx.get_edge_attributes(flow, 'flow').items() 122 out = nx.DiGraph() 123 out.add_weighted_edges_from([k + (v,) for k, v in it if not rm_negative_and_zero or v > 0]) 124 return out 125 126 def cycle_flows(self, axes: typing.Iterable, rm_negative_and_zero=True): 127 """ 128 compute multiple max flow along multiple key vertices that are supposed to be on the loop. 129 130 The flow will be computed from the 1st to 2nd, 2nd to 3rd, ..., last to 1st. 131 132 :param axes: the key vertices. 133 :param rm_negative_and_zero: remove the negative and zero flow on all edges, default as True. 134 :return: a dict of all the flow that is identical to that of the `single_flow`. 135 """ 136 axes = list(axes) 137 axes = axes + axes[:1] 138 out = {} 139 for i, j in itertools.pairwise(axes): 140 out[(i, j)] = self.single_flow(i, j, rm_negative_and_zero) 141 return out 142 143 def magnet_flow(self, s, t): 144 """ 145 With a pair of key vertices s and t in a loop, when edge t -> s exists, it calculates the max flow from s -> t. 146 This is similar to a magnet, where the magnetic field outside corresponds to the max flow and the inner field 147 corresponds to the direct edge. 148 149 Networkx will return a residual network with flow[u][v] == -flow[v][u], but we will remove any negative and zero 150 flows and replace the flow t -> s with just edge t -> s. 151 152 This new graph can be seen as a redistribution of connection intensity, and can be used to do shortest path 153 searches since it's identical in form with the input of the algorithm. 154 155 :param s: the source of the max flow 156 :param t: the sink of the max flow 157 :return: a flow graph with 'weight' as its edge. 158 """ 159 assert self.graph.has_edge(t, s), "the edge from t to s doesn't exist, loop can't be formed" 160 flow = self.single_flow(s, t, rm_negative_and_zero=True) 161 flow.add_edge(t, s, weight=self.graph[t][s]['weight']) 162 return flow 163 164 def merged_cycle_flow(self, axes: typing.Iterable, merge_func=np.sum): 165 """ 166 compute multiple max flows along a cycle of key vertices, and merge the flows into one graph. 167 168 The max flows used are rid of negative and zero flow weights. It's still possible that a single edge may encounter 169 multiple positive flows, meaning some paths can be reused between different key vertices. Then it's necessary 170 to determine how they are merged. 171 172 :param axes: the key vertices. 173 :param merge_func: how to merge multiple weights on a single edge. By default, they are summed. 174 :return: a flow graph with 'weight' as its edge. 175 """ 176 axes = list(axes) 177 sg = nx.DiGraph() 178 for k, v in self.cycle_flows(axes, rm_negative_and_zero=True).items(): 179 sg.add_weighted_edges_from([(u, v, d['weight']) for u, v, d in v.edges(data=True)], k) 180 out = nx.DiGraph() 181 out.add_weighted_edges_from([(u, v, merge_func(list(d.values()))) for u, v, d in sg.edges(data=True)]) 182 return out
Using networkx directed graph for implementing loop search algorithms based on network flow.
As this method uses network flow, you should make sure the edge weights are meaningful so that a bigger weight means a strong connection. The flow along a path is throttled by the max capacity, and the whole flow is distributed over the edges and determined by the overall capacity.
With this class, you can get loops in forms of a residual network.
There is no definite loop that can form a list, unlike shortest path, but it maximally considers and
retains the information of the whole graph while shortest paths may omit lesser good routes.
The graph generated can even be applied the shortest path search, if you find it make sense. You can
assign the graph member in ShortestPathLoopSearch
with the result from this one, because they are identical in form.
Since network flow considers only one pair of source and sink, applying this on multiple pairs can be time-consuming, so the method isn't designed to search through all pairs in the graph.
110 def single_flow(self, s, t, rm_negative_and_zero=True): 111 """ 112 compute a max flow, on edge attribute 'weight'. Though it uses the networkx algorithm that stores the result 113 in a new attribute 'flow', it will get this value back to 'weight' for less confusion. 114 115 :param s: source 116 :param t: sink 117 :param rm_negative_and_zero: remove the negative and zero flow on all edges, default as True. 118 :return: a directed graph with flow results as its edges. 119 """ 120 flow = nx.flow.preflow_push(self.graph, s, t, 'weight') 121 it = nx.get_edge_attributes(flow, 'flow').items() 122 out = nx.DiGraph() 123 out.add_weighted_edges_from([k + (v,) for k, v in it if not rm_negative_and_zero or v > 0]) 124 return out
compute a max flow, on edge attribute 'weight'. Though it uses the networkx algorithm that stores the result in a new attribute 'flow', it will get this value back to 'weight' for less confusion.
Parameters
- s: source
- t: sink
- rm_negative_and_zero: remove the negative and zero flow on all edges, default as True.
Returns
a directed graph with flow results as its edges.
126 def cycle_flows(self, axes: typing.Iterable, rm_negative_and_zero=True): 127 """ 128 compute multiple max flow along multiple key vertices that are supposed to be on the loop. 129 130 The flow will be computed from the 1st to 2nd, 2nd to 3rd, ..., last to 1st. 131 132 :param axes: the key vertices. 133 :param rm_negative_and_zero: remove the negative and zero flow on all edges, default as True. 134 :return: a dict of all the flow that is identical to that of the `single_flow`. 135 """ 136 axes = list(axes) 137 axes = axes + axes[:1] 138 out = {} 139 for i, j in itertools.pairwise(axes): 140 out[(i, j)] = self.single_flow(i, j, rm_negative_and_zero) 141 return out
compute multiple max flow along multiple key vertices that are supposed to be on the loop.
The flow will be computed from the 1st to 2nd, 2nd to 3rd, ..., last to 1st.
Parameters
- axes: the key vertices.
- rm_negative_and_zero: remove the negative and zero flow on all edges, default as True.
Returns
a dict of all the flow that is identical to that of the
single_flow
.
143 def magnet_flow(self, s, t): 144 """ 145 With a pair of key vertices s and t in a loop, when edge t -> s exists, it calculates the max flow from s -> t. 146 This is similar to a magnet, where the magnetic field outside corresponds to the max flow and the inner field 147 corresponds to the direct edge. 148 149 Networkx will return a residual network with flow[u][v] == -flow[v][u], but we will remove any negative and zero 150 flows and replace the flow t -> s with just edge t -> s. 151 152 This new graph can be seen as a redistribution of connection intensity, and can be used to do shortest path 153 searches since it's identical in form with the input of the algorithm. 154 155 :param s: the source of the max flow 156 :param t: the sink of the max flow 157 :return: a flow graph with 'weight' as its edge. 158 """ 159 assert self.graph.has_edge(t, s), "the edge from t to s doesn't exist, loop can't be formed" 160 flow = self.single_flow(s, t, rm_negative_and_zero=True) 161 flow.add_edge(t, s, weight=self.graph[t][s]['weight']) 162 return flow
With a pair of key vertices s and t in a loop, when edge t -> s exists, it calculates the max flow from s -> t. This is similar to a magnet, where the magnetic field outside corresponds to the max flow and the inner field corresponds to the direct edge.
Networkx will return a residual network with flow[u][v] == -flow[v][u], but we will remove any negative and zero flows and replace the flow t -> s with just edge t -> s.
This new graph can be seen as a redistribution of connection intensity, and can be used to do shortest path searches since it's identical in form with the input of the algorithm.
Parameters
- s: the source of the max flow
- t: the sink of the max flow
Returns
a flow graph with 'weight' as its edge.
164 def merged_cycle_flow(self, axes: typing.Iterable, merge_func=np.sum): 165 """ 166 compute multiple max flows along a cycle of key vertices, and merge the flows into one graph. 167 168 The max flows used are rid of negative and zero flow weights. It's still possible that a single edge may encounter 169 multiple positive flows, meaning some paths can be reused between different key vertices. Then it's necessary 170 to determine how they are merged. 171 172 :param axes: the key vertices. 173 :param merge_func: how to merge multiple weights on a single edge. By default, they are summed. 174 :return: a flow graph with 'weight' as its edge. 175 """ 176 axes = list(axes) 177 sg = nx.DiGraph() 178 for k, v in self.cycle_flows(axes, rm_negative_and_zero=True).items(): 179 sg.add_weighted_edges_from([(u, v, d['weight']) for u, v, d in v.edges(data=True)], k) 180 out = nx.DiGraph() 181 out.add_weighted_edges_from([(u, v, merge_func(list(d.values()))) for u, v, d in sg.edges(data=True)]) 182 return out
compute multiple max flows along a cycle of key vertices, and merge the flows into one graph.
The max flows used are rid of negative and zero flow weights. It's still possible that a single edge may encounter multiple positive flows, meaning some paths can be reused between different key vertices. Then it's necessary to determine how they are merged.
Parameters
- axes: the key vertices.
- merge_func: how to merge multiple weights on a single edge. By default, they are summed.
Returns
a flow graph with 'weight' as its edge.
Inherited Members
185class ShortestPathLoopSearch(GraphMaintainer): 186 """ 187 Using networkx directed graph for implementing loop search algorithms based on shortest paths. 188 189 As this method uses shortest paths, you should make sure the edge weights are meaningful so that a 190 smaller weight means a stronger connection. The shortest path aggregates the weights along the path, so that 191 summation is also meaningful. 192 193 With this class, you can get loops in forms of exact lists of nodes along the loop. 194 195 The shortest path algorithm uses the 'weight' attribute in the networkx graph. 196 """ 197 198 def __init__(self): 199 super().__init__() 200 self.sssp = None 201 202 def init(self): 203 # bellman ford 204 self.sssp = {} 205 for k, v in tqdm(nx.all_pairs_bellman_ford_path(self.graph), total=self.graph.number_of_nodes()): 206 self.sssp[k] = v 207 208 def chain_screen(self, n_axis=2, top: int = None, must_include: typing.Iterable = None, allow_knots=False, 209 axis_pool: typing.Iterable = None, axis_must_include: typing.Iterable = None, priority=np.sum): 210 """ 211 Using connected single source shortest paths to find loops. You should ensure that in your graph smaller edge 212 weights means stronger connection for this method. 213 214 First, use Bellman Ford to find the shortest paths between all regions. Then loops can be found by 215 pairing any shortest paths like A -> B and B -> A. You can also find loops with more anchors' shortest path like A -> B, 216 B -> C, C -> A, and even more. But with more anchors, the computation will be greatly slowed down, and number of 217 discovery will be greatly expanded. 218 219 When the search number is very great, you can use `top` to have an early stop. The shortest path are sorted such that a 220 loop with minimal shortest path element will be retrieved first, but it won't guarantee the final loop will be ordered 221 ascending. Besides, you can also set the must included nodes in the loop so that those without are skipped. 222 223 The method attempts to located unique loops by the axes, i.e. one set of axes corresponding to one loop. 224 It's likely that multiple axes can be connected in multiple ways, in which case once finding a shorter one a 225 replacement will occur. The purpose is for balance between the uniqueness and computation. 226 Since the shortest path will also be returned, you can check other connection ways easily. 227 228 This can fail to give a loop between a set of axis, because there may be overlap between their shortest path, thus giving 229 rise to smaller loops, whereas there may exist an exact loop that contain non shortest paths but unique vertices. 230 Therefore, the shortest path from A to B here is better off interpreted as the most preferable path from A to B, 231 which means other paths from A to B are not considered. 232 233 :param n_axis: the number to consider for connecting shortest path into a loop. Larger n_axis will significantly 234 increase the computation 235 :param top: the most prominent results. This can speed up as it searches from the most probable loops and can 236 have an early stop. Default as None to turn off. 237 :param axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. 238 Default as None to turn off. 239 :param must_include: the vertices that must appear in the search result. 240 :param axis_must_include: the vertices that must appear in axes, and must be subset of `axis_pool`. This check 241 is independent of `must_include`. 242 :param allow_knots: whether to allow duplicate nodes (but not duplicated with axes). 243 :param priority: the edge weight aggregation function for shortest path sorting. Default as sum, you can also use, for 244 example, mean to make up for shortest path that travel through many vertices. 245 :return: a dict of axis mapped to loop in the form of loops[(a,b,c)] -> [[a, ..., b], [b, ..., c], [c, ..., a]], 246 where the key (a,b,c) is a sorted tuple of axes, doesn't necessarily mean a -> b -> c -> a, 247 and shortest path in the form of sssp[from_node][to_node] -> [from_node, ..., to_node] 248 """ 249 if must_include is not None: 250 must_include = set(must_include) 251 if axis_must_include is not None: 252 axis_must_include = set(axis_must_include) 253 assert len(axis_must_include) <= n_axis, "number of must included axis is more than specified" 254 if axis_pool is not None: 255 axis_pool = set(axis_pool) 256 assert len(axis_pool) >= n_axis, "size of axis pool is smaller than specified" 257 if axis_must_include is not None: 258 assert axis_must_include.issubset(axis_pool), "must included axis should be subset of axis pool" 259 260 if self.sssp is None: 261 self.init() 262 263 # do an edge sorting beforehand, ascending 264 # vertices not in axis pool (if not None) will be omitted 265 connection = [] 266 for k1, v1 in self.sssp.items(): 267 if axis_pool is not None and k1 not in axis_pool: 268 continue 269 for k2, v2 in v1.items(): 270 if axis_pool is not None and k2 not in axis_pool: 271 continue 272 if k1 == k2: 273 continue 274 w = [] 275 for i, j in itertools.pairwise(v2): 276 w.append(self.graph[i][j]['weight']) 277 connection.append([k1, k2, priority(w)]) 278 connection.sort(key=lambda x: x[2]) 279 sorted_sssp = OrderedDict() 280 # sorted_sssp[k1][k2] -> (length, list of vertices) 281 # and k2 mapping is ordered 282 for i, (k1, k2, w) in enumerate(connection): 283 if k1 not in sorted_sssp: 284 sorted_sssp[k1] = OrderedDict() 285 sorted_sssp[k1][k2] = (w, self.sssp[k1][k2]) 286 287 max_iter = [0 if top is None else top] 288 loops = {} 289 290 def dfs_loop_search(loop: list, visited: set, length: float): 291 # finishing, the last part of the cycle, 292 if len(loop) == n_axis - 1: 293 # check the existence of the last part 294 k1 = loop[-1][-1] 295 k2 = loop[0][0] 296 if k1 not in sorted_sssp or k2 not in sorted_sssp[k1]: 297 yield 298 last_part = sorted_sssp[k1][k2][1] 299 300 # check knot 301 sl = set(last_part[1:-1]) 302 new_visited = visited | sl 303 if not allow_knots and len(new_visited) != len(visited) + len(last_part) - 2 or \ 304 not sl.isdisjoint(i[0] for i in loop): 305 yield 306 loop.append(last_part) 307 length += sorted_sssp[k1][k2][0] 308 axis = tuple(sorted(i[0] for i in loop)) 309 310 if (must_include is None or must_include.issubset(new_visited)) and \ 311 (axis_must_include is None or axis_must_include.issubset(axis)): 312 if axis in loops: 313 # check if smaller than existed 314 if loops[axis]['length'] < length: 315 yield 316 else: 317 max_iter[0] += 1 318 max_iter[0] -= 1 319 loops[axis] = { 320 'loop': loop, 321 'length': length, 322 } 323 # start 324 elif len(loop) == 0: 325 for k1, k2, w in connection: 326 v = sorted_sssp[k1][k2][1] 327 yield dfs_loop_search([v], set(v), w) 328 else: 329 k1 = loop[-1][-1] # last axis 330 for k, (w, c) in sorted_sssp[k1].items(): 331 sc = set(c[1:]) 332 new_visited = visited | sc 333 if len(new_visited) == len(visited) + len(c) - 1 or \ 334 allow_knots and sc.isdisjoint(i[0] for i in loop): 335 yield dfs_loop_search(loop + [c], new_visited, length + w) 336 yield 337 338 gen = dfs_loop_search([], set(), 0) 339 stk = [] 340 while top is None or max_iter[0] > 0: 341 if isinstance(gen, typing.Generator): 342 stk.append(gen) 343 gen = next(gen) 344 else: 345 stk.pop() 346 if not stk: 347 break 348 gen = stk[-1].send(gen) 349 350 return loops 351 352 def pair_complement(self, axis_pool: typing.Iterable = None): 353 """ 354 Given a pair of vertices, the loop is defined as the direct connection between them plus the inverted shortest path. 355 For example, with A and B, the loop can be either edge AB + the shortest path from B to A 356 or edge BA + the shortest path from A to B. This comes in handy when shortest paths from A to B and B to A have 357 intersection, leading to knots in a loop. 358 359 This can also be much faster and doesn't need speedup or sorting. 360 361 :param axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. 362 Default as None to turn off, and it will try every pair of vertices as long as there's a direct edge. 363 :return: a dict of loops, keys are the direct edges and values are the corresponding inverse shortest path. 364 """ 365 if axis_pool is not None: 366 axis_pool = set(axis_pool) 367 if self.sssp is None: 368 self.init() 369 out = {} 370 for fro, to in self.graph.edges: 371 if axis_pool is not None and (fro not in axis_pool or to not in axis_pool) or to == fro: 372 continue 373 if to in self.sssp and fro in self.sssp[to]: 374 out[(fro, to)] = self.sssp[to][fro] 375 return out
Using networkx directed graph for implementing loop search algorithms based on shortest paths.
As this method uses shortest paths, you should make sure the edge weights are meaningful so that a smaller weight means a stronger connection. The shortest path aggregates the weights along the path, so that summation is also meaningful.
With this class, you can get loops in forms of exact lists of nodes along the loop.
The shortest path algorithm uses the 'weight' attribute in the networkx graph.
208 def chain_screen(self, n_axis=2, top: int = None, must_include: typing.Iterable = None, allow_knots=False, 209 axis_pool: typing.Iterable = None, axis_must_include: typing.Iterable = None, priority=np.sum): 210 """ 211 Using connected single source shortest paths to find loops. You should ensure that in your graph smaller edge 212 weights means stronger connection for this method. 213 214 First, use Bellman Ford to find the shortest paths between all regions. Then loops can be found by 215 pairing any shortest paths like A -> B and B -> A. You can also find loops with more anchors' shortest path like A -> B, 216 B -> C, C -> A, and even more. But with more anchors, the computation will be greatly slowed down, and number of 217 discovery will be greatly expanded. 218 219 When the search number is very great, you can use `top` to have an early stop. The shortest path are sorted such that a 220 loop with minimal shortest path element will be retrieved first, but it won't guarantee the final loop will be ordered 221 ascending. Besides, you can also set the must included nodes in the loop so that those without are skipped. 222 223 The method attempts to located unique loops by the axes, i.e. one set of axes corresponding to one loop. 224 It's likely that multiple axes can be connected in multiple ways, in which case once finding a shorter one a 225 replacement will occur. The purpose is for balance between the uniqueness and computation. 226 Since the shortest path will also be returned, you can check other connection ways easily. 227 228 This can fail to give a loop between a set of axis, because there may be overlap between their shortest path, thus giving 229 rise to smaller loops, whereas there may exist an exact loop that contain non shortest paths but unique vertices. 230 Therefore, the shortest path from A to B here is better off interpreted as the most preferable path from A to B, 231 which means other paths from A to B are not considered. 232 233 :param n_axis: the number to consider for connecting shortest path into a loop. Larger n_axis will significantly 234 increase the computation 235 :param top: the most prominent results. This can speed up as it searches from the most probable loops and can 236 have an early stop. Default as None to turn off. 237 :param axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. 238 Default as None to turn off. 239 :param must_include: the vertices that must appear in the search result. 240 :param axis_must_include: the vertices that must appear in axes, and must be subset of `axis_pool`. This check 241 is independent of `must_include`. 242 :param allow_knots: whether to allow duplicate nodes (but not duplicated with axes). 243 :param priority: the edge weight aggregation function for shortest path sorting. Default as sum, you can also use, for 244 example, mean to make up for shortest path that travel through many vertices. 245 :return: a dict of axis mapped to loop in the form of loops[(a,b,c)] -> [[a, ..., b], [b, ..., c], [c, ..., a]], 246 where the key (a,b,c) is a sorted tuple of axes, doesn't necessarily mean a -> b -> c -> a, 247 and shortest path in the form of sssp[from_node][to_node] -> [from_node, ..., to_node] 248 """ 249 if must_include is not None: 250 must_include = set(must_include) 251 if axis_must_include is not None: 252 axis_must_include = set(axis_must_include) 253 assert len(axis_must_include) <= n_axis, "number of must included axis is more than specified" 254 if axis_pool is not None: 255 axis_pool = set(axis_pool) 256 assert len(axis_pool) >= n_axis, "size of axis pool is smaller than specified" 257 if axis_must_include is not None: 258 assert axis_must_include.issubset(axis_pool), "must included axis should be subset of axis pool" 259 260 if self.sssp is None: 261 self.init() 262 263 # do an edge sorting beforehand, ascending 264 # vertices not in axis pool (if not None) will be omitted 265 connection = [] 266 for k1, v1 in self.sssp.items(): 267 if axis_pool is not None and k1 not in axis_pool: 268 continue 269 for k2, v2 in v1.items(): 270 if axis_pool is not None and k2 not in axis_pool: 271 continue 272 if k1 == k2: 273 continue 274 w = [] 275 for i, j in itertools.pairwise(v2): 276 w.append(self.graph[i][j]['weight']) 277 connection.append([k1, k2, priority(w)]) 278 connection.sort(key=lambda x: x[2]) 279 sorted_sssp = OrderedDict() 280 # sorted_sssp[k1][k2] -> (length, list of vertices) 281 # and k2 mapping is ordered 282 for i, (k1, k2, w) in enumerate(connection): 283 if k1 not in sorted_sssp: 284 sorted_sssp[k1] = OrderedDict() 285 sorted_sssp[k1][k2] = (w, self.sssp[k1][k2]) 286 287 max_iter = [0 if top is None else top] 288 loops = {} 289 290 def dfs_loop_search(loop: list, visited: set, length: float): 291 # finishing, the last part of the cycle, 292 if len(loop) == n_axis - 1: 293 # check the existence of the last part 294 k1 = loop[-1][-1] 295 k2 = loop[0][0] 296 if k1 not in sorted_sssp or k2 not in sorted_sssp[k1]: 297 yield 298 last_part = sorted_sssp[k1][k2][1] 299 300 # check knot 301 sl = set(last_part[1:-1]) 302 new_visited = visited | sl 303 if not allow_knots and len(new_visited) != len(visited) + len(last_part) - 2 or \ 304 not sl.isdisjoint(i[0] for i in loop): 305 yield 306 loop.append(last_part) 307 length += sorted_sssp[k1][k2][0] 308 axis = tuple(sorted(i[0] for i in loop)) 309 310 if (must_include is None or must_include.issubset(new_visited)) and \ 311 (axis_must_include is None or axis_must_include.issubset(axis)): 312 if axis in loops: 313 # check if smaller than existed 314 if loops[axis]['length'] < length: 315 yield 316 else: 317 max_iter[0] += 1 318 max_iter[0] -= 1 319 loops[axis] = { 320 'loop': loop, 321 'length': length, 322 } 323 # start 324 elif len(loop) == 0: 325 for k1, k2, w in connection: 326 v = sorted_sssp[k1][k2][1] 327 yield dfs_loop_search([v], set(v), w) 328 else: 329 k1 = loop[-1][-1] # last axis 330 for k, (w, c) in sorted_sssp[k1].items(): 331 sc = set(c[1:]) 332 new_visited = visited | sc 333 if len(new_visited) == len(visited) + len(c) - 1 or \ 334 allow_knots and sc.isdisjoint(i[0] for i in loop): 335 yield dfs_loop_search(loop + [c], new_visited, length + w) 336 yield 337 338 gen = dfs_loop_search([], set(), 0) 339 stk = [] 340 while top is None or max_iter[0] > 0: 341 if isinstance(gen, typing.Generator): 342 stk.append(gen) 343 gen = next(gen) 344 else: 345 stk.pop() 346 if not stk: 347 break 348 gen = stk[-1].send(gen) 349 350 return loops
Using connected single source shortest paths to find loops. You should ensure that in your graph smaller edge weights means stronger connection for this method.
First, use Bellman Ford to find the shortest paths between all regions. Then loops can be found by pairing any shortest paths like A -> B and B -> A. You can also find loops with more anchors' shortest path like A -> B, B -> C, C -> A, and even more. But with more anchors, the computation will be greatly slowed down, and number of discovery will be greatly expanded.
When the search number is very great, you can use top
to have an early stop. The shortest path are sorted such that a
loop with minimal shortest path element will be retrieved first, but it won't guarantee the final loop will be ordered
ascending. Besides, you can also set the must included nodes in the loop so that those without are skipped.
The method attempts to located unique loops by the axes, i.e. one set of axes corresponding to one loop. It's likely that multiple axes can be connected in multiple ways, in which case once finding a shorter one a replacement will occur. The purpose is for balance between the uniqueness and computation. Since the shortest path will also be returned, you can check other connection ways easily.
This can fail to give a loop between a set of axis, because there may be overlap between their shortest path, thus giving rise to smaller loops, whereas there may exist an exact loop that contain non shortest paths but unique vertices. Therefore, the shortest path from A to B here is better off interpreted as the most preferable path from A to B, which means other paths from A to B are not considered.
Parameters
- n_axis: the number to consider for connecting shortest path into a loop. Larger n_axis will significantly increase the computation
- top: the most prominent results. This can speed up as it searches from the most probable loops and can have an early stop. Default as None to turn off.
- axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. Default as None to turn off.
- must_include: the vertices that must appear in the search result.
- axis_must_include: the vertices that must appear in axes, and must be subset of
axis_pool
. This check is independent ofmust_include
. - allow_knots: whether to allow duplicate nodes (but not duplicated with axes).
- priority: the edge weight aggregation function for shortest path sorting. Default as sum, you can also use, for example, mean to make up for shortest path that travel through many vertices.
Returns
a dict of axis mapped to loop in the form of loops[(a,b,c)] -> [[a, ..., b], [b, ..., c], [c, ..., a]], where the key (a,b,c) is a sorted tuple of axes, doesn't necessarily mean a -> b -> c -> a, and shortest path in the form of sssp[from_node][to_node] -> [from_node, ..., to_node]
352 def pair_complement(self, axis_pool: typing.Iterable = None): 353 """ 354 Given a pair of vertices, the loop is defined as the direct connection between them plus the inverted shortest path. 355 For example, with A and B, the loop can be either edge AB + the shortest path from B to A 356 or edge BA + the shortest path from A to B. This comes in handy when shortest paths from A to B and B to A have 357 intersection, leading to knots in a loop. 358 359 This can also be much faster and doesn't need speedup or sorting. 360 361 :param axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. 362 Default as None to turn off, and it will try every pair of vertices as long as there's a direct edge. 363 :return: a dict of loops, keys are the direct edges and values are the corresponding inverse shortest path. 364 """ 365 if axis_pool is not None: 366 axis_pool = set(axis_pool) 367 if self.sssp is None: 368 self.init() 369 out = {} 370 for fro, to in self.graph.edges: 371 if axis_pool is not None and (fro not in axis_pool or to not in axis_pool) or to == fro: 372 continue 373 if to in self.sssp and fro in self.sssp[to]: 374 out[(fro, to)] = self.sssp[to][fro] 375 return out
Given a pair of vertices, the loop is defined as the direct connection between them plus the inverted shortest path. For example, with A and B, the loop can be either edge AB + the shortest path from B to A or edge BA + the shortest path from A to B. This comes in handy when shortest paths from A to B and B to A have intersection, leading to knots in a loop.
This can also be much faster and doesn't need speedup or sorting.
Parameters
- axis_pool: a list of vertices for axes to choose from, which can largely narrow down the search space. Default as None to turn off, and it will try every pair of vertices as long as there's a direct edge.
Returns
a dict of loops, keys are the direct edges and values are the corresponding inverse shortest path.