
AMA: Launched 3 ML startups in 2022 & scaled to $200k ARR.

Hi Indies👋,

I'm an entrepreneur and ML researcher.

As most of you are aware, machine learning (ML) has made incredible leaps forward in the last year, with everything from large language models (GPT-3) to visual diffusion models (DALL-E, SD) and gameplay (Pluribus, Cicero).

There are infinite options for indies to apply ML to niche problems and create a thriving business.

I've done this 3 times over the last year, with 2 successful products and 1 meh.

My products have niched down into:

  • US Healthcare
  • SEO Content Creation
  • Business Planning

Models I've taken to market:

  • Cutting-edge image classification and object detection in the browser
  • Fine-tuned NLP models
  • xgboost decision trees
  • Custom digit classification (DNN OCR)
  • Wide-and-deep recommender systems
  • SD Image creation
  • Diffusion Video model
  • Video Lip Sync Model

Thinking about automating something with ML? Want to know how ML can help your business? Not sure what algorithm to experiment with, or where to get enough data?

Ask me. :)

  1. 2

    Wow 200k ARR, well done! So many questions :)

    Is that $0 to $200k in one year?

    Are these data products or services you're offering? I'm curious, you've given the market, but I'm wondering if this is a B2B sale or more business to enterprise? Maybe both?

    1. 3

      Hi @OrionSeven,

      I did some dev work last year on the healthcare product and closed one small deal. The vast majority of dev work and sales occurred in the last year.

      These are service products, I think it would be hard to have a defensible data product as an indie.

      It is a mix of B2B and B2Enterprise sales.

      I do keep accidentally building products that provide orders of magnitude more value to larger companies.

      Which means I can close smaller deals but I can’t get them to pay a premium. Enterprise companies will often pay a substantial premium.

      1. 1

        That's true, my day job is in enterprise software, and the prices are dramatically different than the average B2B.

        I hope you keep up your track record of "accidentally building products that provide orders of magnitude more value". That's a winning tactic! 😂

  2. 2

    Sorry for the delay in responding, I wasn't expecting so many questions and I went skiing today. ⛷

  3. 2

    Hi @0xBADCAFE, vtubers have been trending over the last few years.

    Any thoughts on which aspects of the vtuber space ML could be applied to?

    1. 1

      Hi @dlowe,

      I had to look up vtuber, I had never heard of it. 🫣

      The vtuber space is going to have a lot of ML applications.

      • Motion transfer using a diffusion model is super early but will have applications in all types of animation.
      • Text-to-speech is becoming very realistic. I could see vtubers producing content in multiple languages without speaking those languages.
      • Lip-sync, combined with the above, will make it easier and cheaper to create animation content.
      • Large language models (LLMs) like GPT-3 will make it easier to create scripts and stories.
      • Image and, soon, video creation (DALL-E, Stable Diffusion) will make it easy to create unique characters or avatars and weave them into a story.
      1. 1

        Thanks @0xBADCAFE. In case you want to partner in this space, send me a DM.

  4. 2

    @0xBADCAFE Thank you for doing this. I have no technical experience, but I'm interested in AI and ML. What are some resources that you would recommend to go from zero to one? Is it even possible for a non tech person to get into this field?

    1. 1

      Hi @ovoviews360,

      Absolutely, the ML space will need all kinds of people helping to build the future.

      With all the auto-ML tools, you don’t need to be an engineer to build something exciting.

      Some good starter resources:

      • YouTube videos covering ML capabilities. I really enjoy Two Minute Papers.
      • Reddit has great subreddits for all different flavours of ML.
      • Papers with Code for a good overview of the state of the art.
      • Andrew Ng’s course has some math but will give you a solid base in ML.
  5. 2

    I'm working on a service that helps developers create Generative AI applications (shameless plug - https://aigur.dev) and I'm looking for common challenges between AI based applications. What did you waste the most time on when working on your products?

    1. 3

      Hi @yairhaimo,

      I wasted the most time on:

      • data cleanup - fixing mislabelled data and cleaning dirty source data.
      • optimisation of models - that didn’t actually provide any substantial improvement in the product. (98.23 -> 99.04)

      AIGUR looks cool! What are you using for auth and usage billing?

  6. 2

    What is your technical background? I am a developer with experience in data analytics, looking to get more into the AI SaaS route myself.

    1. 3

      We’re likely pretty similar. I’m a developer (CS degree) with an interest in data and some background in it.

      I’m self-taught on the ML side. I started in 2020 with nearly zero ML knowledge.

      I started with Andrew Ng’s free course. I wanted to do the fast.ai course but never found time.

      Then I started building, made a ton of mistakes and learned a LOT.

      Now I can understand all major ML technologies/techniques and have built cutting edge models/products.

      1. 2

        Yeah, our backgrounds are pretty similar. I'm just super disoriented when it comes to deciding exactly which niche/market gap to pinpoint. There are so many and it's overwhelming as hell.

        1. 2

          The fact that ML can be applied to effectively every part of the business world does make it a little overwhelming.

          I would suggest going with an area you’re very familiar with and/or have worked in.

          I’m writing up a longer answer on how to pick a niche.

  7. 2

    What are some sub-niches where ML can be used as a base to build a business?

    1. 2

      Hello @Rohit31,

      When it comes to ML businesses, I'm 100% focused on B2B. I think there are some great consumer applications out there but it will be hard for an Indie to take those to market.

      What are some sub-niches? The quick answer to your question is all of them.

      The longer answer is below. Please leave a comment if this is interesting and I can write up a full guide.

      ---

      Quick Guide to Discovering ML Businesses

      1. Find human inefficiency
      2. Check if ML can help
      3. Determine value to company

      Human Inefficiency

      We are bad at so many things. :)

      When I think about ML products, I like to envision all the things humans are already doing that are error-prone, slow, or expensive.

      Why these? Because most applications of B2B ML could be done by humans if time and money were not a concern. But this may change in the future.

      For example, take Jasper.ai, the SEO content app: if you had a team of 10 experienced SEO writers and someone to manage them, you wouldn't need that tool.

      Can ML Help

      This is probably the hardest part. If you're technical you can read papers and see how state-of-the-art algorithms are performing. If you're not, you can look at examples of existing ML and see how they could be applied.

      Example: GPT-3 is already being used to write blog posts, so maybe it could be used to write resumes and cover letters.

      Value to the Company

      This isn't exclusive to ML but is important to try and understand for all projects. Value usually falls into a few different buckets:

      1. Saving the company money
      2. Helping them make more money
      3. Allowing them to do something that was not possible before.

      You can do some quick analysis (see below) to determine value, but the best way is to talk to customers/users.

      Example

      You are renting a car, you stop by the rental agency, and one of the staff members steps outside to inspect the vehicle with you. They mark down all the current car damage on the contract.

      Problem: It takes time to walk to the car, circle it, and mark down all the issues. It is also very easy to miss damage, at the beginning or upon return.

      Solution: An application on their smart phone which records video of the car and itemizes existing damage.

      1. We've found a process that is error-prone (missing damage), and slow (1-5 minutes per car).

      2. We can train a model to detect cats 🐈‍⬛, so can we train a model to detect dents and scratches on a car? Probably, yes.

      3. 1 minute per car * 30 cars a day = 0.5 hours. At $40/hr, that is a $20/day saving. In addition, there is the $200-500 of lost revenue when damage is missed.
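
      The quick analysis in step 3 can be sketched in a few lines of Python. This is only a rough illustration: the function names are made up for this example, and the number of missed-damage incidents is a placeholder assumption, since the example only gives the $200-500 range per miss.

```python
def daily_labor_savings(minutes_per_car, cars_per_day, hourly_rate):
    """Labor cost saved per day if the manual inspection time goes away."""
    hours_saved = minutes_per_car * cars_per_day / 60
    return hours_saved * hourly_rate


def recovered_damage_revenue(missed_incidents, low=200, high=500):
    """Range of revenue recovered, using the $200-500 per missed damage figure."""
    return missed_incidents * low, missed_incidents * high


# Figures from the example: 1 minute per car, 30 cars a day, $40/hr.
labor = daily_labor_savings(minutes_per_car=1, cars_per_day=30, hourly_rate=40)
print(f"Labor savings: ${labor:.2f}/day")

# ASSUMPTION: 2 missed incidents is a placeholder, not a number from the example.
low, high = recovered_damage_revenue(missed_incidents=2)
print(f"Recovered revenue: ${low}-${high}")
```

      Swapping in the slow end of the range (5 minutes per car) shows how quickly the labor saving grows, which is why talking to real customers about their actual numbers matters.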

      1. 1

        Thanks, man for such an insightful response

        1. 1

          This was so helpful man, I love this response so much!

  8. 1

    Hey @0xBADCAFE

    "I started with Andrew Ng’s free course. I wanted to do the fast AI course, but never found time. Then, I started building, made a ton of mistakes, and learned a lot."

    I'm the founder of Metana.io, solving this problem in the web3 space & we're the best out there.

    We're planning to introduce a fast-paced AI/ML bootcamp catering to existing software devs. Would love to have a convo with you. I tried looking for your email, but couldn't find it. My email is [email protected]

    ---

    At Metana, all of our bootcamps follow the same format, which consists of a weekly class led by an instructor who is available to answer questions and provide clarification on assignments.

    During the class, you will work independently on the provided study materials and spend most of your time practicing coding or hacking tasks.

    Afterward, you will have a one-on-one session with your instructor to receive feedback on your work.

    The approach at Metana is similar to being part of a small engineering team with a highly experienced engineer as a mentor. Our courses are designed to optimize learning and allow you to continue studying with us for an extended period of time, unlike other bootcamps that may not commit to your growth for as long.

  9. 1

    Nice, you have achieved great results with your products.

    What was your customer churn rate, and how do you handle it?

    1. 1

      Hi @buckyjames,

      I have had zero churn in my healthcare ML products but that probably will not last.

      With the other two products, we are around a 5% churn. I treat the churning customer as a learning experience and try to get as much detail on their reasoning.

      We are still early and haven't built a lot of features so the value add is not really there yet.

      1. 1

        It's great that you are learning from churning customers. Most founders do not focus on that and lose a lot of their revenue.

        You can reduce your customer churn rate with the help of Churnfree, a customer retention tool that helps membership businesses like yours reduce their churn rate.

        This tool can save up to 46% of your churn requests. It also lets you collect feedback from churned customers on why they left your product. You can try its 14-day free trial to see how much money you can save.

        I hope you get the best results for your product and reduce your churned customers.

  10. 1

    Would you start your 4th one with me? Lol, gratz. I'm curious about the data part. How many data points are okay/good/more than enough? And where do you get them?

    1. 3

      Hi @Rusted,

      I try to avoid co-founders as it doubles/triples what you need to earn and 10x your human problems. :-D

      Data can definitely be the hardest part of ML.

      How much data is enough? That depends on the model and business use case:

      • Building a DNN from scratch - you'll need over 1,000, if not 10K+, elements
      • Fine-tuning a model - 200-500 data elements

      This also depends on quality:

      • 99.99% accuracy - you'll need 100K-1M+ elements
      • 99.9% - tens of thousands
      • Around 99% - you can succeed with a lot less (1-2K)
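
      As a rough sketch, these rules of thumb can be written down as a small Python lookup. The names are hypothetical, and the cut-offs between accuracy buckets are an assumption made for this example; the figures themselves are ballpark numbers, not guarantees.

```python
# Rough dataset sizes quoted above, keyed by target accuracy (%).
# ASSUMPTION: the exact cut-offs between buckets are chosen for this sketch.
ACCURACY_RULES = [
    (99.99, "100K-1M+ elements"),
    (99.9, "tens of thousands of elements"),
    (99.0, "1-2K elements"),
]

# Rough dataset sizes quoted above, keyed by training approach.
APPROACH_RULES = {
    "from_scratch": "1,000 to 10K+ elements",
    "fine_tune": "200-500 elements",
}


def size_for_accuracy(target_accuracy_pct):
    """Return the quoted size bucket, or None below the lowest quoted target."""
    for threshold, size in ACCURACY_RULES:
        if target_accuracy_pct >= threshold:
            return size
    return None


print(size_for_accuracy(99.9))
print(APPROACH_RULES["fine_tune"])
```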

      Collecting or getting access to data can be difficult, but here are a few options that can help:

      • Fake-it-till-you-make-it - Can you have a human doing it for now?
      • Synthetic - Can you generate data that looks realistic?
      • Buy / Borrow - most companies don't have the skills to turn data into models.
      • FOIA - Request data from local, state, and federal agencies.
      1. 2

        Finally a subject that I can 10x something! We gotta work together now.

        Recently watched a video where someone trained an AI to remove watermarks from stock images (for fun). He did it with 42 not-so-great samples and I was like, what, 42? I wouldn't know what to do with 42 samples. I kinda enjoy hoarding data, I just don't know how to process it. Gotta look into this stuff. Borrowing data also sounds interesting. Anyway, thanks for the thorough responses.

  11. 1

    How did things go with your SD image creation project? And what did it mainly do?

    And I know this sounds quite vague, but I'd appreciate any feedback you have on our project:

    We're developing an AI API that lets users run open source models that we host in the cloud. So far, we're almost done with our Stable Diffusion API, which will release soon. From what we've seen, there aren't many on the market that are a good deal for devs making SD projects (most have a monthly sub + pay per use, whereas we're just pay per use).

    We then plan to host BLOOM (open source LLM) with API access in the future as an uncensored alternative to GPT-3/ChatGPT (though with lower quality, since it's pretty hard to beat OpenAI).

    So far, we have a good amount of interested discord users despite having no website or demo: https://discord.gg/DsJXxcMmkC

    Thanks for the AMA btw! I found your insights quite useful.

    1. 1

      @RichardGao,

      Sorry for the delay. We use SD to create custom images based on keywords, which are then woven into business video content used for promotion and SEO. I’m being a little vague because this product is still in stealth. 🤫

      I think an ML API on-demand could be a great indie business but it is going to be a very competitive space. I can think of 10 companies offering this off the top of my head.

      That isn’t to say you won’t be successful but it will be important to have a competitive advantage in features or models or a specific niche.

      Also, competing on price/pricing structure is a double-edged sword. If your competitors charge a monthly fee, then their earnings per customer, and their LTV, will likely be higher. This means that they can spend more on user acquisition or developing features.

      Will having no monthly fee make up for that financial disadvantage?

      Whenever possible do not compete on price.

      1. 1

        Thanks for the advice! Most of these SD APIs use the same AI (I say most, but it's really all lol), so unfortunately, price seems to be the only competitive point rather than quality.

        Your advice is useful though, because we're thinking of having more SD models on there than just the original.

        And our BLOOM API's main advantage is being better overall than the smaller models like GPT-J and NEOX, while also being unfiltered and available to all countries.

        Very excited about the AI space. Although competitive, it's a growing industry and there seems to be space for everyone.

  12. 1

    How do you find selling into healthcare? I built a bunch of ML models for fun but never imagined selling into healthcare was possible for indie hackers?

    1. 1

      What do your fun ML models do?

      I've stayed away from clinical ML models as that would be VERY hard to take to market without funding. My models are more on the business and planning side.

    2. 1

      Hi @senojretep356,

      Hahaha. Selling into healthcare is an absolute nightmare.

      But it is definitely possible for an indie. Building in public for healthcare isn't really possible as most companies won't work with you if you're too small.

      On the positive side of indie healthcare:

      • Not much competition
      • Minimum deal size is ~$10K ARR
      • Average is ~$20K ARR
      • Churn is effectively zero

      I've been working on 1 deal for 10 months, it will likely close in early 2023 for $160-200k ARR.

      Healthcare is the largest single slice of the US economy. It is HUGE!

  13. 1

    Hi!
    I would like to ask you two business-related questions:
    1) Do you often write algos from scratch / from the paper?
    2) How do you build your custom dataset? Do you go through a lot of manual labeling?

    Thank you :)

    1. 1

      Hi @valentinFng,

      1. I’ve built a few models from scratch when I wanted to combine model capabilities into a “super model”. Like image classification and text detection all in one.

      I’ve also implemented a few algorithms from scratch from the papers. But this often takes months and has to be worth it. There are so many open source algorithms, you have to be very sure the novel one will be worth the effort.

      2. I’ve done a lot of fine-tuning, which reduces the size of the dataset needed. I’ve also tried creating synthetic data from a smaller dataset; this works okay.

      On the healthcare side, I found someone to partner with and built them a custom solution in exchange for access to their dataset for building my own models.

      I’ve done a fair bit of manual labelling and have outsourced to Upwork for 150-200hrs of labelling @ $6/hr. Well worth the spend.
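
      For a sense of scale, the outsourced labelling spend works out as below. This is just the arithmetic on the figures above; the function name is made up for illustration.

```python
def labelling_cost(hours, hourly_rate):
    """Total spend for a block of outsourced labelling hours."""
    return hours * hourly_rate


# 150-200 hours at $6/hr, as quoted above.
low = labelling_cost(hours=150, hourly_rate=6)
high = labelling_cost(hours=200, hourly_rate=6)
print(f"Labelling spend: ${low}-${high}")
```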

      Thanks for your great questions!

