Any recommendations for Web Scraper Cloud service?

I am looking to build a quick proof of concept for idea validation.

The product would be built on data scraped from public webpages. Instead of writing a scraper in the backend, I was hoping to make it completely client-side and use a web scraper cloud service.

Does anyone have any recommendations for such a service?

  1. 1

    I recommend using the Scraping Fish API for an MVP to validate your idea quickly. We have a very user-friendly and flexible pricing model, especially for startups. You don't have to commit to any specific monthly scraping volume and you won't lose any unused requests at the end of the month!

    1. 1

      Thanks @mateuszbuda
      I gave it a try, but my use case is the following:

      1. A user visits a website - say google.com
      2. User punches in "Top 10 affordable cars in US"
      3. I want to extract the first link from the Google search results.

      I am not sure if Scraping Fish API supports this.

      1. 1

        We do support this. Here is a blog post on scraping Google SERP which should help you: https://scrapingfish.com/blog/google-serp-geolocation
        If you’re looking for a more generic solution, you can use a JS scenario to interact with a website, for example to input text and click a button: https://scrapingfish.com/docs/js-scenario
        If you run into any issues, don’t hesitate to contact us.

        1. 1

          I see. I will give it a try. js-scenario is more of what I was looking for. Thank you.

  2. 1

    If you choose backend scraping, then Puppeteer is the best choice in my opinion. But if I am correct, you cannot use it for scraping on the client side (directly from the browser).

    I have built a few kinds of scrapers before. What I can tell you is that if a site has an API, use the API instead. Scraping is forbidden by most social networks, and pages change constantly. Trying to gather data from a source whose HTML keeps changing (Facebook, for example) is also pretty painful.

    I once used Puppeteer to gather information about daily menus at restaurants near my workplace. Just one simple change in the source's HTML can cause the scraper to stop working entirely.

    Also, I am not sure about making AJAX requests to other domains from the browser. There may be some CORS policy issues.

    But there are definitely plenty of business ideas around scraping.

    1. 1

      Thanks for the reply @hegi55

      Agreed on API first if available and the drawbacks of scraping.

      Puppeteer would mean spending time building out the scraping code. I was hoping to use a service that automates the scraping part, with the client side only triggering the scrape.
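
For anyone landing on this thread later, the use case discussed above (send a search query through a scraping service, then pull out the first result link) might look roughly like the sketch below. It is only an illustration under stated assumptions: the endpoint `https://scraper.example.com/api/v1/` and its `api_key` and `url` parameters are hypothetical placeholders, not any particular provider's API, and the rule for picking the "first result" (first absolute link that does not point back to Google) is a simplification that real result pages may not satisfy. The sketch makes no network calls; it only builds the request URL and parses HTML that has already been fetched.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; replace them with the ones
# documented by whichever scraping service you end up using.
SERVICE_ENDPOINT = "https://scraper.example.com/api/v1/"


def build_scrape_request(api_key: str, query: str) -> str:
    """Build the service URL that asks for a Google results page for `query`."""
    target = "https://www.google.com/search?" + urlencode({"q": query})
    # The target URL is percent-encoded so it survives as a single parameter.
    return SERVICE_ENDPOINT + "?" + urlencode({"api_key": api_key, "url": target})


class FirstLinkParser(HTMLParser):
    """Keep the href of the first absolute link that is not a Google URL."""

    def __init__(self) -> None:
        super().__init__()
        self.first_link: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self.first_link is not None:
            return
        href = dict(attrs).get("href") or ""
        # Assumption: organic results are absolute links to other sites.
        if href.startswith("http") and "google." not in href:
            self.first_link = href


def first_result_link(html: str) -> Optional[str]:
    """Return the first matching link in `html`, or None if there is none."""
    parser = FirstLinkParser()
    parser.feed(html)
    return parser.first_link


if __name__ == "__main__":
    print(build_scrape_request("YOUR_API_KEY", "Top 10 affordable cars in US"))
    # Made-up sample markup standing in for the HTML a service would return.
    sample = '<a href="/preferences">Settings</a><a href="https://example.com/cars">Cars</a>'
    print(first_result_link(sample))
```

Because the link-picking rule depends on the page markup, it is the part most likely to break when the page changes, which is the fragility mentioned earlier in the thread. Keeping it in one small function at least makes it easy to swap out. Note also that calling such a service straight from the browser would expose the API key to users.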
