Three ways to create a graph in GraphX, with code
Time: 2024-03-26 15:40:26
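GraphX is Apache Spark's graph-processing library. GraphX itself has no Python API, so the Python examples below use the separate GraphFrames package, which represents a graph as a pair of DataFrames. Here are code examples for three common ways to create a graph: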
1. Create a graph from vertex and edge DataFrames:
```python
from pyspark.sql import SparkSession
from graphframes import GraphFrame

spark = SparkSession.builder.getOrCreate()

# Create the vertex and edge DataFrames.
# GraphFrame requires an "id" column for vertices and "src"/"dst" columns for edges.
vertices = spark.createDataFrame([(1, "Alice"), (2, "Bob"), (3, "Charlie"), (4, "David")], ["id", "name"])
edges = spark.createDataFrame([(1, 2), (2, 3), (3, 1), (4, 1)], ["src", "dst"])

# Create the graph object
g = GraphFrame(vertices, edges)
```
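The column names in the example above matter: GraphFrame looks for an `id` column in the vertex DataFrame and for `src` and `dst` columns in the edge DataFrame. The following is a minimal sketch, in plain Python with no Spark dependency, of a sanity check that could be run on the raw rows before handing them to Spark. `check_graph_inputs` is a hypothetical helper written for illustration, not part of GraphFrames, and the check for edges that point at unknown vertices is an extra precaution assumed here rather than something the library is known to enforce.

```python
def check_graph_inputs(vertex_rows, vertex_cols, edge_rows, edge_cols):
    """Return a list of problems; an empty list means the inputs look usable.

    Hypothetical helper for illustration only (not a GraphFrames API).
    """
    problems = []
    if "id" not in vertex_cols:
        problems.append("vertex columns must include 'id'")
    for required in ("src", "dst"):
        if required not in edge_cols:
            problems.append(f"edge columns must include '{required}'")
    if problems:
        # Without the key columns the endpoints cannot be checked.
        return problems

    id_pos = vertex_cols.index("id")
    src_pos, dst_pos = edge_cols.index("src"), edge_cols.index("dst")
    known_ids = {row[id_pos] for row in vertex_rows}

    # Flag every edge endpoint that does not match a vertex id.
    for row in edge_rows:
        for endpoint in (row[src_pos], row[dst_pos]):
            if endpoint not in known_ids:
                problems.append(f"edge {row} references unknown vertex {endpoint}")
    return problems


vertex_rows = [(1, "Alice"), (2, "Bob"), (3, "Charlie"), (4, "David")]
edge_rows = [(1, 2), (2, 3), (3, 1), (4, 1)]
print(check_graph_inputs(vertex_rows, ["id", "name"], edge_rows, ["src", "dst"]))  # prints []
```

Returning a list of messages instead of raising on the first problem makes it possible to report every issue in one pass.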
2. Create a graph by building the DataFrames inline in the GraphFrame constructor call:
```python
from pyspark.sql import SparkSession
from graphframes import GraphFrame

spark = SparkSession.builder.appName("GraphX").getOrCreate()

# Create the GraphFrame object.
# The two DataFrames are passed positionally (vertices first, then edges);
# the Python constructor does not accept `vertices=` / `edges=` keywords.
g = GraphFrame(
    spark.createDataFrame([(1, "Alice"), (2, "Bob"), (3, "Charlie"), (4, "David")], ["id", "name"]),
    spark.createDataFrame([(1, 2), (2, 3), (3, 1), (4, 1)], ["src", "dst"]),
)
```
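One way to confirm that a graph like the one above was built as intended is to compare its degree counts against values worked out by hand. The sketch below computes in-degrees and out-degrees for the sample edge list in plain Python; these are the numbers one would expect when inspecting the same graph through GraphFrames' `inDegrees` and `outDegrees` DataFrames. `degree_counts` is an illustrative helper, not a library function.

```python
from collections import Counter


def degree_counts(edge_rows):
    """Return (in_degrees, out_degrees) as dicts keyed by vertex id.

    Illustrative helper only; vertices with a count of zero are omitted.
    """
    in_deg = Counter(dst for _, dst in edge_rows)
    out_deg = Counter(src for src, _ in edge_rows)
    return dict(in_deg), dict(out_deg)


edge_rows = [(1, 2), (2, 3), (3, 1), (4, 1)]
in_deg, out_deg = degree_counts(edge_rows)
print(in_deg)   # {2: 1, 3: 1, 1: 2}
print(out_deg)  # {1: 1, 2: 1, 3: 1, 4: 1}
```

Vertex 4 (David) has no incoming edge, so it does not appear in the in-degree result at all.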
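3. Convert a GraphX Graph object to a GraphFrame with GraphFrame's fromGraphX method. Neither GraphX nor fromGraphX is available from Python (there is no `pyspark.graphx` module), so this example uses the Scala API:
```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

val spark = SparkSession.builder.appName("GraphX").getOrCreate()
val sc = spark.sparkContext

// Create the GraphX Graph object.
// Vertices are (id, attribute) pairs; each Edge carries src, dst, and an attribute (1 here as a placeholder).
val vertices = sc.parallelize(Seq((1L, "Alice"), (2L, "Bob"), (3L, "Charlie"), (4L, "David")))
val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1), Edge(4L, 1L, 1)))
val graphx = Graph(vertices, edges)

// Convert it to a GraphFrame object
val g = GraphFrame.fromGraphX(graphx)
```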