Introduction - Neo4j Graph Data Science

The Neo4j Graph Data Science Library Manual v2.2
Introduction
The Neo4j Graph Data Science (GDS) library provides efficiently implemented, parallel versions of common graph algorithms, exposed as Cypher procedures.
Additionally, GDS includes machine learning pipelines to train predictive supervised models to solve graph problems, such as predicting missing relationships.
1. API tiers
The GDS API comprises Cypher procedures and functions.
Each of these exists in one of three tiers of maturity:
Production-quality
Indicates that the feature has been tested with regard to stability and scalability.
Features in this tier are prefixed with gds.<operation>.
Beta
Indicates that the feature is a candidate for the production-quality tier.
Features in this tier are prefixed with gds.beta.<operation>.
Alpha
Indicates that the feature is experimental and might be changed or removed at any time.
Features in this tier are prefixed with gds.alpha.<operation>.
The Operations Reference lists all operations in GDS according to their tier.
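The prefix convention above can be sketched as a small helper that derives the tier from a procedure or function name. This is only an illustration: the function name `gds_tier` and the beta and alpha example names are hypothetical and are not part of the GDS API.

```python
def gds_tier(name: str) -> str:
    """Return the maturity tier implied by a GDS procedure or function name.

    Assumption: the tier is determined solely by the second dot-separated
    segment, following the gds.<operation>, gds.beta.<operation> and
    gds.alpha.<operation> prefixes described above.
    """
    parts = name.split(".")
    if len(parts) < 2 or parts[0] != "gds":
        raise ValueError(f"not a GDS operation name: {name!r}")
    if parts[1] == "beta":
        return "beta"
    if parts[1] == "alpha":
        return "alpha"
    return "production-quality"


# Hypothetical names, used only to exercise each branch.
examples = ["gds.someOperation", "gds.beta.someOperation", "gds.alpha.someOperation"]
tiers = {name: gds_tier(name) for name in examples}
```

A check like this may be useful when auditing which calls in a code base rely on features that can still change or be removed.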
2. Algorithms
Graph algorithms are used to compute metrics for graphs, nodes, or relationships.
They can provide insights on relevant entities in the graph (centralities, ranking), or inherent structures like communities (community-detection, graph-partitioning, clustering).
Many graph algorithms are iterative approaches that frequently traverse the graph for the computation using random walks, breadth-first or depth-first searches, or pattern matching.
Due to the exponential growth of possible paths with increasing distance, many of the approaches also have high algorithmic complexity.
Fortunately, optimized algorithms exist that utilize certain structures of the graph, memoize already explored parts, and parallelize operations.
Whenever possible, we’ve applied these optimizations.
The Neo4j Graph Data Science library contains a large number of algorithms, which are detailed in the Algorithms chapter.
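To illustrate why remembering already explored parts of the graph matters, here is a minimal breadth-first search sketch in plain Python. It is not the GDS implementation; the adjacency-dictionary input format and the function name `bfs_distances` are assumptions made for this example.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return the hop distance from source to every reachable node.

    The distances dictionary doubles as the record of explored nodes, so
    each node is expanded once even when many paths lead to it.
    """
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:  # skip nodes already explored
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# A small directed graph containing a cycle (d -> a).
adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
distances = bfs_distances(adjacency, "a")
```

Without the membership check, the cycle in this graph would cause the traversal to revisit nodes indefinitely, and in an acyclic graph the work would grow with the number of paths rather than the number of nodes.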
2.1. Algorithm traits
Algorithms in GDS have specific ways to make use of various aspects of their input graph(s).
We call these algorithm traits.
When an algorithm supports an algorithm trait, this indicates that the algorithm has been implemented to produce well-defined results in accordance with the trait.
The following algorithm traits exist:
Directed
The algorithm is well-defined on a directed graph.
Undirected
The algorithm is well-defined on an undirected graph.
Homogeneous
The algorithm will treat all nodes and relationships in its input graph(s) similarly, as if they were all of the same type.
If multiple types of nodes or relationships exist in the graph, this must be taken into account when analysing the results of the algorithm.
Heterogeneous
The algorithm has the ability to distinguish between nodes and/or relationships of different types.
Weighted
The algorithm supports configuration to set node and/or relationship properties to use as weights.
These values can represent cost, time, capacity or some other domain-specific properties, specified via the nodeWeightProperty, nodeProperties and relationshipWeightProperty configuration parameters.
The algorithm will by default consider each node and/or relationship as equally important.
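The effect of the Weighted trait can be sketched with a toy out-degree computation in plain Python. This is not how GDS computes anything; the relationship dictionaries, the `cost` property and the function name `out_degree` are assumptions made for this example. Only the idea that an unset weight property means every relationship counts equally comes from the text above.

```python
from collections import defaultdict


def out_degree(relationships, relationship_weight_property=None):
    """Sum outgoing relationship weights per source node.

    When no weight property is given, every relationship contributes 1.0,
    which mirrors treating all relationships as equally important.
    """
    scores = defaultdict(float)
    for rel in relationships:
        if relationship_weight_property is None:
            weight = 1.0
        else:
            weight = rel["properties"][relationship_weight_property]
        scores[rel["source"]] += weight
    return dict(scores)


# Hypothetical relationships with a "cost" property.
relationships = [
    {"source": "a", "target": "b", "properties": {"cost": 2.5}},
    {"source": "a", "target": "c", "properties": {"cost": 0.5}},
    {"source": "b", "target": "c", "properties": {"cost": 4.0}},
]

unweighted = out_degree(relationships)
weighted = out_degree(relationships, "cost")
```

In the unweighted case node `a` scores higher than `b` because it has more outgoing relationships; once `cost` is used as the weight, `b` scores higher.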
3. Graph Catalog
In order to run the algorithms as efficiently as possible, GDS uses a specialized graph format to represent the graph data.
It is therefore necessary to load the graph data from the Neo4j database into an in-memory graph catalog.
The amount of data loaded can be controlled by so-called graph projections, which also allow, for example, filtering on node labels and relationship types, among other options.
For more information see Graph Management.
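Conceptually, a projection selects part of the stored graph and registers the result under a name. The following plain-Python sketch models that idea only; it does not use or reproduce the GDS graph format or API. The data layout, the `project` function and the rule that a relationship is kept only when both of its endpoints are kept are assumptions made for this example.

```python
def project(nodes, relationships, node_labels, relationship_types):
    """Return the subgraph restricted to the given labels and types.

    nodes: dict mapping node id to a set of labels.
    relationships: list of (source, type, target) tuples.
    """
    kept_nodes = {
        node_id for node_id, labels in nodes.items() if labels & node_labels
    }
    kept_relationships = [
        (source, rel_type, target)
        for source, rel_type, target in relationships
        if rel_type in relationship_types
        and source in kept_nodes
        and target in kept_nodes
    ]
    return {"nodes": kept_nodes, "relationships": kept_relationships}


# Hypothetical database content.
nodes = {1: {"Person"}, 2: {"Person"}, 3: {"Company"}}
relationships = [(1, "KNOWS", 2), (1, "WORKS_AT", 3), (2, "WORKS_AT", 3)]

# A dictionary stands in for the in-memory graph catalog.
catalog = {}
catalog["persons"] = project(nodes, relationships, {"Person"}, {"KNOWS"})
```

Algorithms would then run against the named entry in the catalog rather than against the full stored graph, which is why filtering at projection time reduces the amount of data held in memory.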
4. Editions
The Neo4j Graph Data Science library is available in two editions.
The open source Community Edition:
Includes all algorithms.
Includes most of the catalog operations to manage graphs, models and pipelines. Unavailable operations are listed below.
Limits the concurrency to 4 CPU cores.
Limits the capacity of the model catalog to 4 models.
The Neo4j Graph Data Science library Enterprise Edition:
Can run on an unlimited number of CPU cores.
Supports the role-based access control system (RBAC) from Neo4j Enterprise Edition.
Supports running GDS as part of a cluster deployment.
Includes capacity and load monitoring.
Supports various additional graph catalog features, including:
Graph backup and restore.
Data import and export via Apache Arrow.
Supports various additional model catalog features, including:
Storing an unlimited number of models in the model catalog.
Sharing of models between users, by publishing them.
Model persistence to disk.
Supports an optimized graph implementation.
Allows the configuration of defaults and limits.
For more information see System Requirements - CPU.
© 2022 Neo4j, Inc.