postgresql - Complex query for returning an email conversation list -



postgresql - Complex query for returning an email conversation list -

i have complex query not working in sql fiddle.

in app work for, sync users gmail our database. store emails in emails table , have replies table store references header lists parent replies email.

so example, if have email this:

id | subject | message_id --------------------------------------------------------------------------------------------- 1 | howzitgoin | 53856b1448c89_23fa9605badd015951@3a139e8c-0b81-42c2-8e59-133c262e96a9.mail

there no records in replies table:

now if import reply email this:

id | subject | message_id --------------------------------------------------------------------------------------------- 2 | re: howzitgoin | caebv8ytu_a6ltp_uguq-qsvj3zojwuiwcjgzpsppez1pj3_i1a@mail.gmail.com

we store next in replies table:

email_id | message_id ------------------------------------------------------------------------------------------ 2 | 53856b1448c89_23fa9605badd015951@3a139e8c-0b81-42c2-8e59-133c262e96a9.mail

and if received reply this:

id | subject | message_id --------------------------------------------------------------------------------------------- 3 | re: howzitgoin | 53856b88a2a09_23fa9605badd01601b@3a139e8c-0b81-42c2-8e59-133c262e96a9.mail

we store next in replies table:

email_id | message_id --------------------------------------------------------------------------------------------- 3 | 53856b1448c89_23fa9605badd015951@3a139e8c-0b81-42c2-8e59-133c262e96a9.mail 3 | caebv8ytu_a6ltp_uguq-qsvj3zojwuiwcjgzpsppez1pj3_i1a@mail.gmail.com

after much head scratching, came query:

"ranked_replies" ( select "r"."email_id", "r"."message_id", "rnk" (select *, rank() on (partition "message_id" order "email_id" desc) "rnk" "replies") "r" inner bring together "emails" on ("emails"."message_id" = "r"."message_id") ), "count_of_replies" ( select "email_id", count(*) "count", count(*) "thread_count" "ranked_replies" grouping "email_id" ) select distinct "emails".*, "thread_count" "emails" left bring together "count_of_replies" on ("emails"."id" = "count_of_replies"."email_id") ( ("folder" = 'inbox') , ( ("emails"."message_id" not in (select "message_id" "ranked_replies" ("rnk" != 1))) or ("emails"."message_id" null) ) , ("emails"."id" not in (select "email_id" "ranked_replies" ("rnk" != 1))) ) order "created_at" desc limit 50 offset 0

the problem is not returning email thread has subject of 'not returning'.

the reason because of part of clause:

("emails"."message_id" not in (select "message_id" "ranked_replies" ("rnk" != 1))

this excludes root email subject of 'not returning' because has 2 rows in ranked_replies rank 1 , 2.

i want query that:

show e-mails have no replies them (e.g. not in threads) show top end of each thread, if multiple end nodes of each thread exist want one. show e-mails in current folder (inbox).

referring sqlfiddle example: should homecoming e-mails: #5 (highest ranking of thread 1), #8 (highest ranking of thread 2), #9 (not in thread) , #10 (not highest ranking, 1 of thread in inbox)

i'm having problem #10.

the "not returning" e-mails not returning because top 1 has replies, , lower ones not in "inbox".

as sidenode: have "app" suppose enhance transferring metadata of (at to the lowest degree until specific amount) mails clients , sorting / filtering there. scale improve (and quicker in user experience) having database thinking, depending on userbase etc. question:

i'm quite uncertain on why utilize rank() anyway, dropped it. if want go on using because of other requirements overlooked can so: utilize rank on subselect on "replies" handles rows replies in current folder.

maybe want skip selection of "threadid" in solution , solve inserting id in first place. or create unique id per thread yourself.

as don't know gmail way of showing mails assume want following:

show e-mails have no replies them (e.g. not in threads) show top end of each thread, if multiple end nodes of each thread exist assume want one. care e-mails in current folder.

i created this sql fiddle that. in there changed database model reference message replied primary key (id) instead of message_id. numerical sequence can used address thread tree , did so.

this solution:

with "thread" ( -- select uppermost id per thread select r."email_id", min("reply_to_id") "threadid" "replies" r inner bring together "emails" e on r.email_id = e.id -- create tree current folder , e.folder = 'inbox' grouping r."email_id" ), "lastmail" ( -- select highest email per thread select t."threadid", max(r."email_id") "lastmail" replies r inner bring together thread t on t.threadid = r.reply_to_id grouping t."threadid" ), "count_of_replies" (select r."email_id", count(r.*) "thread_count" replies r inner bring together thread t on t.threadid = r.reply_to_id grouping r."email_id") select distinct "emails".*, "thread_count" "emails" left bring together "count_of_replies" on ("emails"."id" = "count_of_replies"."email_id") ( -- current folder ("folder" = 'inbox') , ( -- ones in no thread ("emails"."id" not in (select "email_id" "thread" union select "threadid" "thread")) or -- ones top in thread ("emails"."id" in (select "lastmail" "lastmail")) ) ) order "created_at" desc limit 50 offset 0

postgresql psql

Comments

Popular posts from this blog

php - Android app custom user registration and login with cookie using facebook sdk -

django - Access session in user model .save() -

php - .htaccess Multiple Rewrite Rules / Prioritizing -