Hello everyone, I’m Hydra.
Although there are still two long days to go before the Mid-Autumn Festival holiday, there is good news too: today is the day “Thor 4” lands on Disney+ (which means we will be able to watch it online soon) ~
Anyone who knows Norse mythology knows that its family relationships can be summed up in one word: “chaos”! Even the intricate web of relationships from Thor 3 shown below covers only a small part of it.
In the previous article we went over some basic theory of knowledge graphs. As the saying goes, all talk and no practice is just for show, so today let’s see how to build and display this complex character relationship graph in a Spring Boot project.
This article uses the following main modules to build the connections between entities and implement the knowledge graph:
The storage layer of a knowledge graph depends on a graph database. Here we choose Neo4j, a high-performance NoSQL graph database that stores structured data as a graph rather than in tables.
First, installation. Open the official website and download the Neo4j installation package; the free Community Edition is enough. The address is below:
Note that Neo4j 4.x and later require a JDK 11 environment, so if you are running JDK 8, stick with a 3.x release. After downloading and unpacking, start it from the bin directory with the command:
Once it has started, open port 7474 of the server in your browser to reach the Neo4j console page:
From the navigation bar on the left we can view the stored data, some basic query examples, and the help documentation.
The input box marked with the $ symbol at the top is where you enter and run Neo4j’s CQL query statements; the syntax is described below.
Just as we use SQL in a relational database, Neo4j uses the Cypher Query Language (CQL) to query the graph database. Let’s briefly go through the basic create, query, and delete operations.
In CQL you create a node with the CREATE command. The syntax for creating a node without properties is as follows:
A CREATE statement contains two basic elements: the node name node-name and the label name label-name. The label is roughly equivalent to a table name in a relational database, and the node name is a variable that refers to this particular piece of data.
Take the following CREATE statement as an example: it is equivalent to inserting an empty record without any properties into a Person table.
To create a node with properties, append a JSON-like map describing the properties after the label name:
Create a node with the following statement:
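To make the two forms of CREATE concrete, here is a minimal Java sketch that only formats the statements as text. The class and method names are hypothetical (they are not part of Neo4j or any driver), and apart from the name Rocky the values are assumptions for illustration. In a real project you would pass values as query parameters instead of concatenating strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: it only formats a Cypher CREATE statement as text
// so the syntax is visible. It does not talk to a database.
class CreateStatementSketch {

    // Builds "CREATE (nodeName:label)" or "CREATE (nodeName:label {k: 'v', ...})".
    static String createNode(String nodeName, String label, Map<String, String> props) {
        StringBuilder cql = new StringBuilder("CREATE (" + nodeName + ":" + label);
        if (!props.isEmpty()) {
            StringJoiner map = new StringJoiner(", ", " {", "}");
            for (Map.Entry<String, String> e : props.entrySet()) {
                // Property keys are unquoted; string values are single-quoted.
                map.add(e.getKey() + ": '" + e.getValue().replace("'", "\\'") + "'");
            }
            cql.append(map);
        }
        return cql.append(")").toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("name", "Rocky");
        // Node without properties, then node with a property map.
        System.out.println(createNode("p", "Person", new LinkedHashMap<>()));
        System.out.println(createNode("p", "Person", props));
    }
}
```

The first call produces a statement of the node-name plus label-name form, and the second appends the property map after the label.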
After creating a node, we can use the MATCH command to query existing nodes and their properties. The command format is as follows:
MATCH is usually followed by RETURN, DELETE, or a similar command that returns or deletes the matched data.
Execute the following command:
The visualized result looks like this:
You can see the two nodes added above: the empty node without properties and the node with properties. Every node also gets an automatically generated id as its unique identifier.
Next we delete the useless node without properties that we created earlier. As mentioned above, deletion is done with MATCH combined with DELETE.
This delete statement also uses a WHERE filter, which is very similar to WHERE in SQL; here the statement filters by the node’s id.
After the deletion, run the query again and you can see that only the Rocky node remains:
Neo4j stores and manages data following the property graph model, which means we can maintain relationships between nodes.
We created one node above, so we need to create another to serve as the other end of the relationship:
The basic syntax for creating a relationship is as follows:
Of course, you can also create a relationship between existing nodes. Below we first query with MATCH and then link the results to create a relationship between the two nodes:
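As a rough sketch of the MATCH-then-CREATE pattern just described, the Java snippet below formats such a statement as text. The helper is hypothetical, and the label, the second node name (Apollo), and the relationship type (FRIEND) are made up for illustration; matching both nodes by a name property is also an assumption.

```java
// Hypothetical helper that formats a Cypher statement linking two existing nodes.
// It builds text only and is not part of Neo4j or any driver.
class RelationshipStatementSketch {

    // MATCH finds both end nodes by name, CREATE adds a directed relationship a -> b.
    static String relateExisting(String label, String startName, String endName, String relType) {
        return "MATCH (a:" + label + " {name: '" + startName + "'}), "
             + "(b:" + label + " {name: '" + endName + "'}) "
             + "CREATE (a)-[r:" + relType + "]->(b) RETURN r";
    }

    public static void main(String[] args) {
        // Values are illustrative assumptions, not the article's data.
        System.out.println(relateExisting("Person", "Rocky", "Apollo", "FRIEND"));
    }
}
```

The arrow in the pattern gives the relationship its direction, from the node bound to a toward the node bound to b.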
Once it has been added, you can query the matching nodes and relationships through the relationship:
You can see that an association has been added between the two:
Note that if a node takes part in a relationship, simply deleting the node reports an error:
In this case, you need to delete the relationship together with the node:
Executing the statement above deletes the node along with the relationships attached to it.
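The delete rule above can be pictured with a tiny in-memory analogy in Java. This is not Neo4j code and every name in it is hypothetical; it only mirrors the behavior described: a plain delete is refused while relationships exist, and a detaching delete removes the relationships first. Cypher provides DETACH DELETE for the second case, though whether the statement above used that form or deleted the relationship explicitly is not something this sketch can confirm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory analogy only; all names are hypothetical.
class TinyGraph {
    private final Map<Long, String> nodes = new HashMap<>(); // id -> name
    private final List<long[]> edges = new ArrayList<>();    // {startId, endId}
    private long nextId = 0;

    // Every node gets a generated id as its unique identifier.
    long addNode(String name) {
        nodes.put(nextId, name);
        return nextId++;
    }

    void relate(long start, long end) {
        edges.add(new long[] {start, end});
    }

    // Like a plain DELETE: refuses to remove a node that still has relationships.
    void delete(long id) {
        for (long[] e : edges) {
            if (e[0] == id || e[1] == id) {
                throw new IllegalStateException("node " + id + " still has relationships");
            }
        }
        nodes.remove(id);
    }

    // Like DETACH DELETE: removes the node's relationships first, then the node.
    void detachDelete(long id) {
        edges.removeIf(e -> e[0] == id || e[1] == id);
        nodes.remove(id);
    }

    int nodeCount() { return nodes.size(); }

    int edgeCount() { return edges.size(); }
}
```

Refusing the plain delete is what keeps the graph free of relationships that point at nothing.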
That’s it for a quick start with CQL. It already covers our simple business scenario, so let’s start integrating Neo4j into Spring Boot.
Create a Spring Boot project (version 2.3.4 here) and add the Neo4j dependency:
Configure the Neo4j connection information in application.yml:
If you are fluent in JPA, the rest will feel very familiar, because the pattern is basically the same: build the model layer and the repository layer, then work through custom or template methods on top of them.
We can describe a node in the graph with annotation-based entity mapping: adding @NodeEntity to an entity class marks it as a node entity in the graph, and adding @Property to a field marks it as a property of the node.
Such an entity class represents that the instance node it creates
Build a persistence-layer interface for the entity above: extend the Neo4jRepository interface and add the @Repository annotation to the interface.
Two methods are added to the interface for later tests: selectAll() returns all data, and findByName() queries specific nodes by name.
Next, call the template method of the repository layer in the service layer:
Call the save() endpoint from the front end to add a node, then run a query in the console, and you can see that the new node has been added to the graph through the endpoint:
Add another method to the service that queries all nodes by directly calling the selectAll() method we defined in NodeRepository:
The query results are printed in the console:
That’s all for node operations; next we build relationships between nodes.
In Neo4j, a relationship can also be treated as a special kind of entity, so it too can be described by an entity class. Unlike a node, the class needs the @RelationshipEntity annotation, and the start and end nodes of the relationship are specified with @StartNode and @EndNode.
Again, let’s create a persistence-layer interface for it:
The interface defines a custom method that queries a relationship by start node, end node, and relationship content, which we will use later.
In the service layer, create a method that builds a relationship based on node names:
After calling this method through the endpoint to bind the relationship between Hela and Thor, query the result:
When building a knowledge graph in a real project, the input is very often unstructured data rather than nodes and relationships that we enter by hand. So we need the ability to extract knowledge from text; in short, to extract the SPO (subject-predicate-object) triples from a piece of text to form the nodes and edges of the graph.
Here we use a ready-made utility class from Git to perform semantic analysis of the text and extract SPO triples. Project address:
Although the project is fairly simple, with only two classes and two resource files, its utility class does a good job of extracting the subject, predicate, and object of a sentence. Before using it we need to add its dependencies:
Then copy the two classes under the com.hankcs.nlp.lex package of that project into our project, and copy the models directory under resources into our resources.
After completing the steps above, call the methods of the MainPartExtractor utility class to run a simple SPO extraction test on some text:
In the result object MainPart, the important parts are the three properties subject, predicate, and object. They are of type TreeGraphNode and encapsulate the subject, predicate, and object of the sentence. Let’s take a look at the test results:
As you can see, if a sentence has a clear subject, predicate, and object, they are all extracted. If one of them is missing, that item is null and the rest of the sentence structure is still extracted normally.
Building on the above, we can dynamically build a knowledge graph in our project. Create a new TextAnalysisServiceImpl that implements two key methods.
The first method creates a node in Neo4j from the subject or object extracted from the sentence. It checks by name whether the node already exists, returns it directly if so, and adds it otherwise:
Then comes the core method, which is actually quite simple. It takes a sentence as its parameter, first extracts the SPO triple from the text, saves the entities as Node objects, and then checks whether a relationship with the same name already exists. If not, the relationship is created; if so, it is not created again. Here is the key code:
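The control flow of these two methods can be sketched in plain Java with in-memory collections standing in for the repositories. This is a simplified illustration, not the project’s actual service code: the class, the string-based storage, and the sample triples are all hypothetical, and skipping triples with a null part is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the node and relationship repositories; names are hypothetical.
class GraphBuildSketch {
    final Map<String, String> nodesByName = new LinkedHashMap<>();
    final List<String> relations = new ArrayList<>(); // stored as "start|name|end"

    // Get-or-create: return the existing node for this name, or add a new one.
    String getOrCreateNode(String name) {
        return nodesByName.computeIfAbsent(name, n -> n);
    }

    // Takes an already extracted SPO triple; returns true only if a new relation was stored.
    boolean addTriple(String subject, String predicate, String object) {
        // Assumption of this sketch: incomplete triples are skipped.
        if (subject == null || predicate == null || object == null) {
            return false;
        }
        String start = getOrCreateNode(subject);
        String end = getOrCreateNode(object);
        String key = start + "|" + predicate + "|" + end;
        if (relations.contains(key)) {
            return false; // same start, end, and name: do not create it again
        }
        relations.add(key);
        return true;
    }

    public static void main(String[] args) {
        GraphBuildSketch graph = new GraphBuildSketch();
        // Made-up triples for illustration only.
        System.out.println(graph.addTriple("Thor", "owns", "Hammer")); // first time: stored
        System.out.println(graph.addTriple("Thor", "owns", "Hammer")); // duplicate: skipped
        System.out.println(graph.nodesByName.keySet());
    }
}
```

Looking up by name before saving is what lets several sentences about the same character attach to one shared node instead of creating duplicates.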
Create a simple controller interface for receiving text:
Next, we pass in the following sentence text from the front end for testing:
After the call completes, look at the graph in Neo4j and you can see that Hela, Death, Thor, and Hammer are linked together:
At this point a simple pipeline from text processing to graph creation is fully connected, but it is still fairly rough and needs further optimization in the following areas:
In short, there is still a lot to improve. I have also pushed the project code to Git; if you are interested, take a look, and if I have time later I will keep improving it from this version. Reply “neo” in the official account backend to get the project address.
Well, that’s it for this sharing, I’m Hydra, we’ll see you next time.