ruby on rails - Infinite process -




I'm creating a Rails app that scrapes the contents of a few websites - let's say 15 shops and their products. The scraping is an infinite process: it scrapes each shop one by one, and when the last shop has been scraped, the worker goes back to the first one and the whole process starts again from the beginning.

My first thought was to use a kind of recursive Sidekiq worker: scrape shop no. 1 and, after success, scrape the next shop by firing itself:

class FetcherWorker
  include Sidekiq::Worker

  def perform(shop_id)
    Shop.find(shop_id).fetch_products
    FetcherWorker.perform_async(next_shop_id)
  end
end

However, I have absolutely no experience with such long-running processes, so I wanted to ask if there is a best practice or an obvious solution I should use in this situation. It's quite important to me to be able to see what's going on and which shop is being scraped (and Sidekiq provides such tools). Thanks in advance.

I would separate performing the work from scheduling it. If a job crashes, it might not reschedule itself. There is also a bootstrap issue (say, on restart): you need to schedule the first job somehow.

I'd add a last_scraped_at timestamp to the Shop model and add a scope :up_for_scraping that finds shops that have not been scraped for n minutes.
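The scope logic above can be sketched in plain Ruby (the Shop struct and helper below are stand-ins for the ActiveRecord model and scope, not code from the answer):

```ruby
# Hypothetical stand-in for the ActiveRecord Shop model; last_scraped_at
# is nil for shops that have never been scraped.
Shop = Struct.new(:id, :last_scraped_at)

# Plain-Ruby sketch of the proposed scope, roughly equivalent to an
# ActiveRecord scope like:
#   scope :up_for_scraping,
#         -> { where("last_scraped_at IS NULL OR last_scraped_at < ?", n.minutes.ago) }
def up_for_scraping(shops, n_minutes, now: Time.now)
  cutoff = now - n_minutes * 60
  shops.select { |s| s.last_scraped_at.nil? || s.last_scraped_at < cutoff }
end
```

Shops that have never been scraped (nil timestamp) are always considered due, so the bootstrap case is covered for free.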

Then a scheduler finds those shops and queues them in Sidekiq for execution. The scheduler can be a simple Ruby script triggered by cron to start with.
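A cron-triggered scheduler could look like the sketch below. The schedule, app path, and log file are placeholders, and it assumes the :up_for_scraping scope and the FetcherWorker from the question exist:

```shell
# crontab entry (sketch): every 5 minutes, queue every shop that is due.
*/5 * * * * cd /path/to/app && bin/rails runner 'Shop.up_for_scraping.find_each { |s| FetcherWorker.perform_async(s.id) }' >> log/scheduler.log 2>&1
```

Because scheduling is now external, a crashed job is simply picked up again on the next cron tick instead of silently ending the chain.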

Perhaps you can also make sure jobs don't run multiple times for the same shop:

def perform(shop_id)
  return if runs_or_has_been_running_within_a_short_period_for_this_shop?
  # ...
end
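The guard predicate above is left abstract in the answer. A minimal in-memory sketch of the idea follows; all names here are illustrative, and a real implementation would need shared state (e.g. Redis, or a gem such as sidekiq-unique-jobs), since Sidekiq jobs run across multiple threads and processes:

```ruby
# In-memory sketch of a "don't run the same shop twice in a short period"
# guard. Not safe across processes; for illustration only.
class ScrapeGuard
  def initialize(period_seconds)
    @period = period_seconds
    @last_run = {} # shop_id => Time of the last accepted run
  end

  # Returns true and records the run if the shop has not run recently;
  # returns false if a run happened within the configured period.
  def try_acquire(shop_id, now: Time.now)
    last = @last_run[shop_id]
    return false if last && (now - last) < @period
    @last_run[shop_id] = now
    true
  end
end
```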

This should help you avoid piling up more work than the worker pool can handle. The number of jobs queued should remain roughly constant over time. If it piles up, then either the scraping code is not performant enough and you don't have enough workers, the scraping frequency is too high, or there isn't enough hardware. If the queue is empty, you can scrape more often.
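To tell which case you are in, you can sample the queue size periodically (Sidekiq's Ruby API exposes this as Sidekiq::Queue.new("default").size; the queue name is an assumption) and check whether the backlog trends upward. A minimal sketch over such samples:

```ruby
# Sketch: decide whether the backlog is growing from periodic samples of
# the queue size. Each sample could come from
# Sidekiq::Queue.new("default").size, part of Sidekiq's Ruby API.
def backlog_growing?(samples, tolerance: 0)
  return false if samples.size < 2
  samples.last - samples.first > tolerance
end
```

A small tolerance avoids flagging normal jitter between two cron ticks as growth.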

ruby-on-rails ruby
