
Is RNeo4j Transactional Endpoint slow? #52

Open
mamonu opened this issue Mar 9, 2016 · 9 comments

Comments

@mamonu

mamonu commented Mar 9, 2016

I'm used to importing a CSV via the Neo4j console. I had 50,000 rows.
After setting up an index, I imported them in about 0.8 seconds.

Tried the same thing today with the transactional endpoint and it took 3 mins.

Is it that slow or am I doing something wrong?

@nicolewhite
Owner

Can you show me your code?

@mamonu
Author

mamonu commented Mar 9, 2016

Sure. If I go to the Neo4j web interface and run this, for example:


CREATE INDEX ON :Person(person_ID)
//# Added 1 index, statement executed in 1662 ms.

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///home/data/SPRINT3-a-v1.csv" AS row
MERGE (a:Person { person_ID: row.person_id1, source: "a" })
RETURN a
//# Returned 12613 rows in 894 ms.

Now if I run the following code for the same data
(after loading it into a data frame called data):


library(RNeo4j)
graph = startGraph("http://localhost:7474/db/data/")

addIndex(graph, "Person", "person_ID")
getIndex(graph)

t1 <- Sys.time()

query = 'MERGE (a:Person { person_ID: {person_ID}, source: "a" })'

# Append one MERGE per row to a single open transaction, then commit once.
t = newTransaction(graph)

for (i in 1:nrow(data)) {
  person_ID = data[i, ]$person_id1
  appendCypher(t, query, person_ID = person_ID)
}

commit(t)

t2 <- Sys.time()
t2 - t1

I get:

Time difference of 3.250754 mins

@mamonu
Author

mamonu commented Mar 21, 2016

Any news on this? I might be doing something wrong, but from what I understand this is the way to use the transactional endpoint. The performance is worrisome, though.

@nicolewhite
Owner

Sorry, thought I had responded to you. The problem is that you're committing in batches of 1000 in LOAD CSV and in a single batch of 12613 in the R code. It's not really a fair comparison. Can you commit in batches of 1000 in your R code and get back to me?

@mamonu
Author

mamonu commented Mar 21, 2016

OK, I'll do that and get back to you.

@sdoyen

sdoyen commented Apr 9, 2016

Any workaround for this? Thanks.

@mamonu
Author

mamonu commented Apr 18, 2016

Apologies for the long delay; some other projects took my time...
Back to the problem at hand... I ran the following code, which loads the same data as the
LOAD CSV command in Cypher.


library(RNeo4j)
graph = startGraph("http://localhost:7474/db/data/")
clear(graph)
setwd("/home/bigdata/data/")
data <- read.table(file = "SPRINT3-a-v1.csv", sep = ",", header = TRUE)

addIndex(graph, "Person", "person_ID")
getIndex(graph)

query = 'MERGE (a:Person { person_ID: {person_ID}, source: "a" })'

t1 <- Sys.time()
tx = newTransaction(graph)

for (i in 1:nrow(data)) {
  if (i %% 1000 == 0) {
    # Commit the current batch and open a new transaction,
    # mirroring USING PERIODIC COMMIT 1000 in LOAD CSV.
    commit(tx)
    print(paste("Batch:", i / 1000, "committed."))
    tx = newTransaction(graph)
  }

  person_ID = data[i, ]$person_id1
  appendCypher(tx, query, person_ID = person_ID)
}

commit(tx)
print("Last batch committed.")
print("All done!")

t2 <- Sys.time()
t2 - t1

I think this makes a fair comparison (loading the data in batches of 1000).
I still get about 3 minutes for the operation.
Apologies if this code is wrong and I have not understood the concept well...

@mkllr888

I have the same problem. Neither createNode() / createRel() nor appendCypher() is fast enough to use. My workaround is to use getNode() and cypher() with plain queries. I also create CSV files and import them via LOAD CSV. Both have the disadvantage that the R code is not really understandable if the reader doesn't know what Cypher/Neo4j is, and creating the CSV files needs storage.
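Roughly what I mean, as an untested sketch (reusing mamonu's file path and column from above as placeholders):

library(RNeo4j)
graph = startGraph("http://localhost:7474/db/data/")

# Write the data frame to a CSV the Neo4j server can reach; the server
# reads the file itself, so the path must be visible to it.
write.csv(data, "/home/data/SPRINT3-a-v1.csv", row.names = FALSE)

query = '
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///home/data/SPRINT3-a-v1.csv" AS row
MERGE (p:Person { person_ID: row.person_id1, source: "a" })
RETURN count(p)
'

# Send the whole import as one statement; USING PERIODIC COMMIT may be
# rejected inside an already-open transaction, so run it through
# cypher() rather than newTransaction()/appendCypher().
cypher(graph, query)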

Thanks for your hard work.

@nicolewhite
Owner

Sorry, I don't think the transactional endpoint will ever be as fast as LOAD CSV or neo4j-import. createNode() and createRel() definitely won't be as fast, as they create nodes / relationships one at a time, each in its own transaction.
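That said, you can usually close some of the gap by sending many rows per statement with UNWIND, so each round trip carries a whole batch instead of a single MERGE. An untested sketch (assumes your data frame has a person_id1 column, and that a list of row-lists serializes to a Cypher collection of maps):

library(RNeo4j)
graph = startGraph("http://localhost:7474/db/data/")

# One statement MERGEs a whole batch of rows via an UNWIND parameter.
query = '
UNWIND {rows} AS row
MERGE (p:Person { person_ID: row.person_ID, source: "a" })
'

batch_size = 1000
for (start in seq(1, nrow(data), by = batch_size)) {
  end = min(start + batch_size - 1, nrow(data))

  # Build the batch as a list of named lists -> JSON array of objects.
  rows = lapply(start:end, function(i) list(person_ID = data[i, ]$person_id1))

  tx = newTransaction(graph)
  appendCypher(tx, query, rows = rows)
  commit(tx)
}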
