cypher - Slow Berlin SPARQL benchmark queries in Neo4j



I am trying the Berlin SPARQL Benchmark queries in Neo4j. I have created a Neo4j graph from the triples using http://michaelbloggs.blogspot.de/2013/05/importing-ttl-turtle-ontologies-in-neo4j.html

To summarize the data loading, the graph has the following structure:

subject => node
predicate => relationship
object => node

If the predicate's value is a date, string, or integer (a primitive), a property is created instead of a relationship and stored in the node.

Now, I am trying the following queries, which are slow in Neo4j.

Query 4: find the feature with the highest ratio between the price with that feature and the price without that feature. The corresponding SPARQL query is this:

    prefix bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
    prefix bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
    prefix xsd: <http://www.w3.org/2001/XMLSchema#>

    select ?feature ((?sumf*(?counttotal-?countf))/(?countf*(?sumtotal-?sumf)) as ?priceratio)
    {
      { select (count(?price) as ?counttotal) (sum(xsd:float(str(?price))) as ?sumtotal)
        {
          ?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/producttype294> .
          ?offer bsbm:product ?product ;
                 bsbm:price ?price .
        }
      }
      { select ?feature (count(?price2) as ?countf) (sum(xsd:float(str(?price2))) as ?sumf)
        {
          ?product2 a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/producttype294> ;
                    bsbm:productfeature ?feature .
          ?offer2 bsbm:product ?product2 ;
                  bsbm:price ?price2 .
        }
        group by ?feature
      }
    }
    order by desc(?priceratio) ?feature
    limit 100

The Cypher query I created for this is:

    MATCH p1 = (offer1:offer)-[r1:`product`]->(products1:producttype294)
    MATCH p2 = (offer2:offer)-[r2:`product`]->(products2:producttype294)-[:`productfeature`]->(features)
    RETURN (sum(DISTINCT offer2.price) * (count(DISTINCT offer1.price) - count(DISTINCT offer2.price))
            / (count(DISTINCT offer2.price) * (sum(DISTINCT offer1.price) - sum(DISTINCT offer2.price)))) AS cnt,
           features.__uri__ AS frui
    ORDER BY cnt DESC, frui

This query is slow. Please let me know whether I am formulating the query in the wrong way.

Another query is query 5: show the most popular products of a specific product type for each country, by review count. The SPARQL query is:

    prefix bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
    prefix bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
    prefix rev: <http://purl.org/stuff/rev#>
    prefix xsd: <http://www.w3.org/2001/XMLSchema#>

    select ?country ?product ?nrofreviews ?avgprice
    {
      { select ?country (max(?nrofreviews) as ?maxreviews)
        {
          { select ?country ?product (count(?review) as ?nrofreviews)
            {
              ?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/producttype403> .
              ?review bsbm:reviewfor ?product ;
                      rev:reviewer ?reviewer .
              ?reviewer bsbm:country ?country .
            }
            group by ?country ?product
          }
        }
        group by ?country
      }
      { select ?product (avg(xsd:float(str(?price))) as ?avgprice)
        {
          ?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/producttype403> .
          ?offer bsbm:product ?product .
          ?offer bsbm:price ?price .
        }
        group by ?product
      }
      { select ?country ?product (count(?review) as ?nrofreviews)
        {
          ?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/producttype403> .
          ?review bsbm:reviewfor ?product .
          ?review rev:reviewer ?reviewer .
          ?reviewer bsbm:country ?country .
        }
        group by ?country ?product
      }
      filter(?nrofreviews=?maxreviews)
    }
    order by desc(?nrofreviews) ?country ?product

The Cypher query I created is the following:

    MATCH (products2:producttype403)<-[:`reviewfor`]-(reviews:review)-[:`reviewer`]->(rvrs)-[:`country`]->(countries)
    WITH count(reviews) AS reviewcount, products2.__uri__ AS pruis, countries.__uri__ AS cntrs
    MATCH (products1:producttype403)<-[:`product`]-(offer:offer)
    WITH avg(offer.price) AS avgprice, max(reviewcount) AS maxrevs, cntrs
    MATCH (products2:producttype403)<-[:`reviewfor`]-(reviews:review)-[:`reviewer`]->(rvrs)-[:`country`]->(countries)
    WITH avgprice, maxrevs, countries, count(reviews) AS rvs, countries.__uri__ AS curis, products2.__uri__ AS puris
    WHERE maxrevs = rvs
    RETURN curis, puris, rvs, avgprice

Even this query is slow. Am I formulating the queries in the right way?

I had 10M triples (the Berlin benchmark dataset), and every type predicate is converted to a label. (For query 4) I'm trying to get the feature with the highest ratio between the price with that feature and the price without that feature. Is this the right way to formulate the query? (For query 4) I do get the right results with this query. If I don't compute the sum and count, the query gets executed really fast.

Thanks in advance :) The SPARQL queries and data can be found at: http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/businessintelligenceusecase/index.html#queries

These look like global graph queries to me. What is the size of the dataset?

You create a cartesian product between the two paths? Shouldn't the two paths be connected somehow?
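One possible way to avoid the cartesian product in query 4, sketched under assumptions rather than tested against the BSBM data: compute the overall totals first, carry that single row forward with WITH, and only then aggregate per feature. The labels, relationship types, and the `__uri__` property are taken from the question; the `toFloat` calls are an assumption that only matters if `price` was imported as a string.

```cypher
// Pass 1: totals over all offers of the product type (yields a single row)
MATCH (o:offer)-[:`product`]->(:producttype294)
WITH count(o.price) AS counttotal, sum(toFloat(o.price)) AS sumtotal

// Pass 2: per-feature count and sum, with the totals carried along
MATCH (o2:offer)-[:`product`]->(:producttype294)-[:`productfeature`]->(f)
WITH f, counttotal, sumtotal,
     count(o2.price) AS countf, sum(toFloat(o2.price)) AS sumf

// Same expression as in the SPARQL query
RETURN f.__uri__ AS feature,
       (sumf * (counttotal - countf)) / (countf * (sumtotal - sumf)) AS priceratio
ORDER BY priceratio DESC, feature
LIMIT 100
```

Because the first pass collapses to one row, the second MATCH runs once instead of once per offer, and DISTINCT is no longer needed to undo duplicated rows. A feature that is present on every product of the type would make the denominator zero, so that case may need separate handling.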

Shouldn't there be a property type on the producttype label, i.e. (:producttype {type:"294"})? And if there is, you'd have an index on :producttype(type) and :order(orderno).
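As a rough illustration of that suggestion: the `type` property and the value "294" are the hypothetical model from the paragraph above, not what the importer in the question produces (it creates one label per product type).

```cypher
// Index syntax for Neo4j 2.x to 4.x; on Neo4j 5 the equivalent is
//   CREATE INDEX FOR (n:producttype) ON (n.type)
CREATE INDEX ON :producttype(type);

// Look the product type up by an indexed property instead of a per-type label
MATCH (o:offer)-[:`product`]->(p:producttype {type: "294"})
RETURN count(o) AS offers;
```

With this model the type becomes a query parameter, so the same query text can be reused for every product type.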

I don't understand the calculation.

Is it the delta of the counts of distinct prices, times the sum of distinct prices of offer 2, divided by the count of distinct prices of offer 2 times the delta of the sums of the two order prices?
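Read against the SPARQL query in the question, the expression appears to be a ratio of two averages. Using the SPARQL variable names (sumf and countf for offers on products with the feature, sumtotal and counttotal for all offers of the product type), it can be rearranged as:

```latex
\[
\text{priceratio}
  = \frac{\text{sumf}\,(\text{counttotal} - \text{countf})}
         {\text{countf}\,(\text{sumtotal} - \text{sumf})}
  = \frac{\text{sumf} / \text{countf}}
         {(\text{sumtotal} - \text{sumf}) / (\text{counttotal} - \text{countf})}
\]
```

The numerator is the average price of offers with the feature and the denominator is the average price of offers without it. Note that the SPARQL query counts and sums all prices, while the Cypher version uses DISTINCT, so the two need not agree when several offers share the same price.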

    MATCH (offer1:offer)-[r1:`product`]->(products1:producttype294)
    MATCH (offer2:offer)-[r2:`product`]->(products2:producttype294)-[:`productfeature`]->(features)
    RETURN (sum(DISTINCT offer2.price) * (count(DISTINCT offer1.price) - count(DISTINCT offer2.price))
            / (count(DISTINCT offer2.price) * (sum(DISTINCT offer1.price) - sum(DISTINCT offer2.price)))) AS cnt,
           features.__uri__ AS frui
    ORDER BY cnt DESC, frui
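Query 5 from the question has the same shape of problem: its second MATCH is not connected to the rows produced by the first one. A hedged sketch of one way to keep everything connected follows. It has not been run against the BSBM data, it assumes a Neo4j version that supports UNWIND, and the `toFloat` call is only needed if `price` is stored as a string.

```cypher
// Review count per (country, product)
MATCH (p:producttype403)<-[:`reviewfor`]-(r:review)-[:`reviewer`]->()-[:`country`]->(c)
WITH c, p, count(r) AS nrofreviews

// Per country: the maximum review count, plus the rows it was computed from
WITH c, max(nrofreviews) AS maxreviews,
     collect({product: p, n: nrofreviews}) AS rows
UNWIND rows AS row
WITH c, row.product AS p, row.n AS nrofreviews, maxreviews
WHERE nrofreviews = maxreviews

// Average offer price, only for the products that survived the filter
MATCH (p)<-[:`product`]-(o:offer)
RETURN c.__uri__ AS country, p.__uri__ AS product,
       nrofreviews, avg(toFloat(o.price)) AS avgprice
ORDER BY nrofreviews DESC, country, product
```

The review pattern is matched once instead of twice, and the offer lookup starts from already bound product nodes, so it touches only the offers of the top products per country.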

neo4j cypher graph-databases
