How robust against spam should side projects be?

11 points by cyjackx 2 days ago | 10 comments

I am just getting into making my first project as a hobbyist, and it involves user image uploads. I can imagine a whole host of issues that platforms deal with, from spam to AI to NSFW. Integrating captchas, image analyzers, etc. all feels like overkill. "Leave it be until I need it" seems like the right answer; anything more is putting the cart before the horse, assuming it'll even get any traffic or attention. But it's a silly enough joke idea that I want to see if it gets shared around a bit.

andyish a day ago | next |

All systems should have a hard upper limit to them; it's not that hard to implement, and it's the ultimate safeguard.

In the backend, limit image size to say 10 MB, the number of images a user can upload to 100, and the number of users that can sign up to 1k. Then you have some numbers which let you say "regardless of what happens, I've got a hard cap of 1 TB of stored data".

You don't have to show the user their usage, and you can always lift the caps with a deployment. What you don't want is a side project costing you tens of thousands because one fool thought it was funny to upload a million hi-res pictures of Rick Astley while you were asleep.
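Concretely, the cap checks are only a few lines. A rough sketch (Flask here, with in-memory counters standing in for whatever database you actually use; the routes and the auth header are made up):

    # Hard backend caps: whatever happens, storage can never exceed the worst case.
    from flask import Flask, request, abort

    app = Flask(__name__)

    MAX_IMAGE_BYTES = 10 * 1024 * 1024   # 10 MB per image
    MAX_IMAGES_PER_USER = 100
    MAX_USERS = 1_000                    # 10 MB x 100 x 1k users = 1 TB worst case

    users: set[str] = set()              # stand-ins for your real user/image tables
    image_counts: dict[str, int] = {}

    @app.post("/signup")
    def signup():
        if len(users) >= MAX_USERS:
            abort(503)                   # hard cap reached: signups closed
        users.add(request.form["email"])
        return "ok"

    @app.post("/upload")
    def upload():
        user = request.headers.get("X-User", "")   # stand-in for real auth
        if user not in users:
            abort(401)
        if request.content_length is None or request.content_length > MAX_IMAGE_BYTES:
            abort(413)                   # payload too large
        if image_counts.get(user, 0) >= MAX_IMAGES_PER_USER:
            abort(429)                   # per-user image cap reached
        image_counts[user] = image_counts.get(user, 0) + 1
        # ...write request.get_data() to storage here...
        return "ok"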

Slightly more work would be to limit the number of images a user can upload in any 15-minute window. Or, depending on what your signup flow is, make users confirm their email but put a short delay on sending the confirmation email out.
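A fixed-window counter is enough for the 15-minute limit. Another sketch (process-local; in production you'd keep the counters in Redis or your database instead):

    # Allow at most N uploads per user per 15-minute window.
    import time

    WINDOW_SECONDS = 15 * 60
    MAX_UPLOADS_PER_WINDOW = 5

    _windows: dict[str, tuple[int, int]] = {}   # user -> (window start, uploads so far)

    def allow_upload(user: str) -> bool:
        now = int(time.time())
        start, count = _windows.get(user, (now, 0))
        if now - start >= WINDOW_SECONDS:
            start, count = now, 0                # window expired, start fresh
        if count >= MAX_UPLOADS_PER_WINDOW:
            return False                         # over the limit, reject
        _windows[user] = (start, count + 1)
        return True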

cyjackx 19 hours ago | root | parent |

Yeah, for me <5 MB per pic uploaded (and then compressed) and a 5-10 image limit per user should be sufficient to start. That, or some platform-specific mechanic like you only get 1 upload per 10 votes...
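For the compression step I'm assuming something like Pillow's thumbnail-and-re-encode will do (sketch; the size and quality numbers are arbitrary):

    # Re-encode an upload as a bounded-size JPEG with Pillow.
    from PIL import Image

    MAX_SIDE = 1600       # arbitrary; pick whatever your layout needs
    QUALITY = 80

    def compress(src_path: str, dst_path: str) -> None:
        img = Image.open(src_path)
        img.thumbnail((MAX_SIDE, MAX_SIDE))      # shrinks in place, keeps aspect ratio
        img.convert("RGB").save(dst_path, "JPEG", quality=QUALITY)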

Good reminder to set usage caps now and lift them with a deployment later.

joewils 2 days ago | prev | next |

I've found most spam comes from bots operating out of certain countries, so filtering by IP address stops a lot of it. I've used a combination of GeoIP and IP address categorization to limit the impact of bots on my side projects.

Bot Protection: https://github.com/growlfm/ipcat

GeoIP Evaluation: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data...
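The country check itself is only a few lines (sketch using the geoip2 package against a downloaded GeoLite2 database; the blocked set is whatever your own spam logs justify):

    # Drop requests from countries that only ever send you spam.
    import geoip2.database
    import geoip2.errors

    BLOCKED_COUNTRIES = {"XX", "YY"}    # placeholder ISO codes; use your own logs

    reader = geoip2.database.Reader("GeoLite2-Country.mmdb")   # free MaxMind download

    def is_blocked(ip: str) -> bool:
        try:
            iso = reader.country(ip).country.iso_code
        except geoip2.errors.AddressNotFoundError:
            return False                # unknown IP: let it through
        return iso in BLOCKED_COUNTRIES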

You can dig deeper using dedicated services for URL and image inspection.

Image Evaluation: https://cloud.google.com/vision/docs/detecting-safe-search

URL Evaluation: https://cloud.google.com/web-risk/docs/overview
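The Vision SafeSearch check, for example, is a single request (sketch with the official google-cloud-vision Python client; where you draw the rejection line is up to you):

    # Flag an upload whose SafeSearch likelihood is LIKELY or VERY_LIKELY.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()   # needs Google Cloud credentials set up

    def looks_unsafe(image_bytes: bytes) -> bool:
        image = vision.Image(content=image_bytes)
        ann = client.safe_search_detection(image=image).safe_search_annotation
        bad = {vision.Likelihood.LIKELY, vision.Likelihood.VERY_LIKELY}
        return ann.adult in bad or ann.violence in bad or ann.racy in bad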

JoeNuts 2 days ago | prev | next |

Personally, I've never had real people use any of my personal projects, just some bots occasionally trying to break into the server with generic scripts. So you'll probably be fine.

codingdave 2 days ago | prev | next |

> Leave it be until I need it seems like the right answer

That might be correct, but it is wise to know what moves you intend to make if a problem occurs. For example, suppose you wake up one morning to find that your site was flooded overnight with illegal content. If you already had a plan in place for problematic content, you execute the plan instead of being stuck in a high-stress situation, forced to research and make quick decisions on the fly.

cyjackx 2 days ago | root | parent |

In theory I'd start integrating some safe-image API or something, but I'm not seasoned enough to know whether scrubbing the data away manually at that point would be easy enough. Right now I use Supabase email auth, and I figure that cuts things down somewhat.

And if I am to have a plan, why not just implement it from the start?

codingdave 2 days ago | root | parent |

Risk management is not (always) about prevention as much as it is about reaction and mitigation.

Most nefarious attacks on sites/apps are occasional or one-time things. As an example, I used to work on a site that would get DDoSed a few times a year. I'm not sure why we were targeted, but rather than move our entire weird old legacy infrastructure to a vendor who could mitigate DDoS attacks, we had standard actions to take: call our server dude, roll traffic to the backup data center, identify the IPs at fault, add them to our block list, and ask partners and customers to let us know if the new IP blocks affected them.

It was an annoyance, but not a disaster. That is the level of preparation you want - enough to just be annoyed when bad things happen, not demolished.

That should not stop you from prevention, either, of course - if you want to be proactive about such things, go for it.

brudgers 2 days ago | prev | next |

> it involves user image uploads

As soon as the people you don’t want to use your system learn about your system, they will try to use your system.

dieselgate 2 days ago | prev | next |

Maybe just a hidden form field with "are_you_a_bot" characteristics, if necessary. Otherwise, things like SQL injection prevention come out of the box with frameworks like Rails.
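The hidden-field trick is a classic honeypot: render a field real users never see, and drop any submission that fills it in. A sketch (Flask here; the field name and route are arbitrary):

    # Honeypot check: humans never see the field, naive bots fill in every field.
    # The form renders: <input name="are_you_a_bot" style="display:none" tabindex="-1">
    from flask import Flask, request, abort

    app = Flask(__name__)

    @app.post("/submit")
    def submit():
        if request.form.get("are_you_a_bot"):
            abort(400)        # hidden field was filled in: treat as a bot
        # ...handle the genuine submission...
        return "ok"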