How to host millions of files - need advice

For a new project of mine I am using WordPress and a self-written crawler to autogenerate image posts from various sources. I wrote a script that automatically creates a draft in WordPress and puts the image into the database and media folder. The problem I am facing is that if I leave everything automated I will generate over 3 million posts (with 3 million pictures in their original resolution). That will create a lot of inodes, so obviously 'normal' hosts are a no-go.

I don't feel good about storing the images on S3 at the very beginning (because I have no clue how much data transfer there will be), but I would like to use Cloudflare to speed things up on the front end. Being an image/photo website, it's pretty obvious that bandwidth should be unlimited and storage capacity should be several terabytes to start with.

I also need a host that provides support in terms of DMCA: I would comply with takedown requests, but I don't want the host to kill my account the first time they receive a takedown notice. A DMCA-ignore host is not required but preferred, for some peace of mind. The content itself is drawing/design related; think of DeviantArt as a good example of what I am trying to accomplish.

To be honest, I don't know where to start. I do have a host right now that supports 250k inodes, but that will be filled within days of scraping. My host offered to switch me to dedicated hosting, but I don't feel ready to shell out hundreds of dollars per month before the project is even known.
I'd be happy if you guys could recommend next steps. I've been a long-time lurker in this community and I appreciate the way you help out in situations like this.

I would suggest you do need dedicated, or at the very least a VPS with a large amount of storage space, but those are few and far between as a typical offering. You say you want unlimited bandwidth, but in reality unlimited does not exist, so you will likely come up against a fair use policy. Approximately how much do you envisage requiring?
Stratagem Hosting - Lightspeed fast, forever.

Well, that's a good question: I have no idea. I am going to do some marketing campaigns on Facebook and other social media once 20% is up. Since it's a niche site with just a few competitors, I expect the bounce rate to be low and people to browse/save pictures for hours at a time. I just want to be prepared for any type of traffic explosion in case some post goes viral, so I would rather prepare beforehand. However, I also cannot justify shelling out $300 a month for a low-end dedicated server, because if the idea doesn't work out I'd have wasted that cash.

Basically I see two options for now:
1) Get an expensive dedicated package, prepare everything and run it on autopilot for the next years
2) Start with a high-end shared host, then move to dedicated if "it works out".
The problems I have with #2 are possible downtime/server load in case of high traffic, daily monitoring, and that I cannot prepare all the content as drafts, only a percentage. It makes things complicated long term.

It certainly sounds like you'll need quite a chunk then. Are you looking for a managed dedicated server, or are you going to be administering the server yourself? If the latter, they can certainly be had for cheaper than $300 per month.
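For a rough sense of scale behind the inode concern in the opening post, here is a minimal sketch of the arithmetic. The 3 million images and the 250k inode limit come from the thread; the number of resized copies stored per image is an assumption (WordPress can generate extra sizes per upload, and the count depends on configuration), and the function names are made up for illustration.

```python
# Figures taken from the thread.
NUM_IMAGES = 3_000_000   # expected number of scraped images
INODE_LIMIT = 250_000    # inode limit on the current shared host


def estimate_files(num_images: int, derived_sizes_per_image: int = 0) -> int:
    """Files on disk: one original per image plus any resized copies."""
    return num_images * (1 + derived_sizes_per_image)


def times_over_limit(total_files: int, inode_limit: int) -> float:
    """How many times the file count exceeds the host's inode limit."""
    return total_files / inode_limit


if __name__ == "__main__":
    # 0 = originals only; 4 is an ASSUMED number of resized copies per image.
    for derived in (0, 4):
        total = estimate_files(NUM_IMAGES, derived)
        ratio = times_over_limit(total, INODE_LIMIT)
        print(f"{derived} resized copies: {total:,} files, {ratio:.0f}x the limit")
```

Even with originals only, 3 million files is 12 times the 250k limit, so the exact number of resized copies mostly changes how far past the limit the site ends up, not whether it exceeds it.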
Unfortunately I have no idea how to manage or set up a server; I can just handle the basics such as cPanel etc. I've just had a look at the hosting offers for dedicated servers here in the forum. There are some interesting deals on unmetered traffic with several TB of space. However, with so many people offering more or less the same thing in the same price range, it's a nightmare to make a choice.

As you're going to be looking for managed then, I would suggest looking through the offers as you have and making a short list of a handful of hosts or so, then discussing with them what you're planning to do (via live chat, ticket, phone, etc.). In doing so, you'll get a feel for what they can offer and how they will support you going forward.

High inode usage (millions of files) can cause very high disk input/output. I would suggest using a dedicated server with NVMe SSDs.
Fast Host - Cloud Hosting w/ LiteSpeed & Imunify360 Security Backup Storage - Backup Storage w/ SFTP, SSH & cPanel Access

I believe this thread went a bit off track. I am still looking for suggestions that have not been mentioned, e.g. how to do this project gradually without investing too much in the beginning, the biggest problem being the inodes for scheduled posts in WordPress. However, if we talk about dedicated, I am open to recommendations from people here. I have kind of narrowed down what I need in the beginning:
- managed dedicated
hosting w/ root access (in case I outsource server management to someone else)
- 24/7 support
- unmetered bandwidth (fair use)
- fast; it seems like at least 1 Gbps is standard
- offshore location; I'd prefer Asia because 30% of the traffic will be from Japan
- good history on this forum, at least 5+ years in business, stellar reputation
- privacy (!)
As for the server specs, I have no clue what I actually need. But it should be able to handle ~20k simultaneous users at peak in case things go viral. Please correct me if that's impossible using a light WP + Cloudflare setup.

Yes, any good VPS can withstand such a load!

How is this like DeviantArt? Images at DeviantArt are uploaded by members, are they not? This does not appear to be what you are planning to do. Will you have the copyright owners' permission to host their images?

I mentioned DeviantArt in order to give people an idea of the scale of the project, not the way it works. My project does not infringe anyone's copyright, as I am making use of the 'fair use' clause (section 107/108). In addition, the original author's name, a backlink and other relevant information are published. For further information have a look at this: https://en.wikipedia.org/wiki/Copyri...ng_and_framing Let's not derail the topic and make this about copyright; I did my homework ; )

"Yes, any good VPS can withstand such a load!" So VPS is also an option? What min/max configuration should I be looking for? If a VPS can pull this off then that's good news pricewise.

I agree with @HostXNow_Chris and Ben, a dedicated server with NVMe SSDs is a must-have for this project if you're going to have millions of inodes.
Most VPS and shared hosts aren't going to allow you to use millions of inodes or a lot of CPU and RAM, and they'll probably shut your site down as soon as it starts using significant resources. Many VPS providers also impose IO usage limits.
| AMD Epyc | AMD Ryzen 3/5/7/9 | Intel i3/i5/i7/i9 | Intel Xeon | ARM devices | | Custom Server Builds | Server repair | Server Upgrades | System and Network admin |

"Many VPS providers also impose IO usage limits." Excellent point and I can't echo that enough. For example, I've lost count of the number of times I've tried to use a VPS for backup purposes, only to find the disk IO throttled so much that I ended up using a dedicated server or a backup account where a highish amount of resources like CPU and disk IO is stated.

Okay, no matter whether it's VPS, dedicated or even shared: would it be theoretically possible to set up one main server for the backend and then slowly add more VPS servers acting as image CDNs while the site grows? I notice that a lot of large sites host their images at img.site.com but also have img1.site.com, img2.site.com etc. Does anyone have experience with this? And is it good for SEO?

Spreading the load out between many VPSes or just shared accounts would only be possible if your scripts support such a setup. I don't think SEO would be affected by using different subdomains. You could also use a dispatching system whereby the image is pulled from "images.example.com" and a script is run for every request there: it finds the image in the image database, which lists which server the image is on, and then returns that image in the response without any redirection.
-Steven | u2-web@Cooini, LLC - Business Shared Hosting | Isolate sites with Webspaces | Site Builder | PHP-FPM | MariaDB WHMCS Modules: Staff Knowledgebase | Custom Modules and Hooks

Sounds like you need to just go with a cloud provider so you only pay for what you use. If you think bandwidth will be an issue, ensure people are not uploading massive files.
You will more than likely need to hire a systems engineer or get a managed hosting provider to help you. Normally you would set your web frontend up to run behind a load balancer to handle the load and enable you to scale resources horizontally (add more servers) instead of vertically (add more resources to a single server). You can normally do something similar to the following, using AWS as an example:

Primary Site:
- Load balancer for handling traffic from the internet
- Set up two web servers to start, with scaling rules in place to add another one if necessary
- Set up an S3 bucket to handle the storage of images and duplication of that data to other availability zones
- Use RDS for MySQL to have AWS manage the scaling of your database and replication to another site
- Set up Glacier for regular backups of your site

The problem you are going to run into with a dedicated server that might meet your needs is scalability, and overpaying for capacity you do not need yet. You can choose a similar setup from many of the other providers out there, like Azure, Google, etc., but if your site is using a nice chunk of resources you really do need to give it the proper infrastructure.

Now if you wanted to go old school, you would need to find a host that would set you up on their load balancer and set up physical servers, or at least VPS servers, for the web servers; then you could store the files you need in block storage rather than on your VPS/dedicated server, leaving your VPS/dedicated server to just do the processing of requests for images.

"So VPS is also an option? What min/max configuration should I be looking for? If a VPS can pull this off then that's good news pricewise." I have no idea what kind of adult libations @alexhost1 has been imbibing but it's safe to say that the quantity would be "a lot" and the potency would flatten a horse. Your design consideration of "millions of inodes" means a VPS is a non-starter.
If you owned your own hardware and were handling storage separately from front-end requests, sure, VPSes could handle the FE stuff. As an all-in-one solution, however, it's absolutely out of the question.

All that said, you need to learn for yourself how the Big Boys do this kind of stuff, otherwise you risk engineering yourself right into a corner. The nice thing is those Big Boys aren't hiding how they do things... Start here: http://highscalability.com/youtube-architecture
Daddy? How was vi born? Well son, first cat and echo fell in love...
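The dispatching idea suggested earlier in the thread (a script behind "images.example.com" that looks up which server holds an image and returns it without a redirect) could look roughly like the sketch below. This is only an illustration under stated assumptions: the lookup table stands in for the "image database", and the img1/img2 hostnames, the file names and the function names are all hypothetical.

```python
from typing import Callable, Dict, Tuple
from urllib.request import urlopen

# ASSUMPTION: stand-in for the image database that records which storage
# server holds each image. Hostnames and file names are hypothetical.
IMAGE_LOCATIONS: Dict[str, str] = {
    "cat.png": "http://img1.example.com",
    "dog.png": "http://img2.example.com",
}


def fetch_over_http(url: str) -> bytes:
    """Fetch the image bytes from the storage server that holds them."""
    with urlopen(url) as response:
        return response.read()


def dispatch(
    image_name: str,
    locations: Dict[str, str] = IMAGE_LOCATIONS,
    fetch: Callable[[str], bytes] = fetch_over_http,
) -> Tuple[int, bytes]:
    """Return (status, body) for a request to images.example.com/<image_name>.

    The body is fetched from the backing server and handed straight back,
    so the visitor never sees a redirect to img1/img2.
    """
    server = locations.get(image_name)
    if server is None:
        return 404, b"not found"
    return 200, fetch(f"{server}/{image_name}")


if __name__ == "__main__":
    # Fake fetcher so the sketch runs without any real servers.
    fake_fetch = lambda url: f"bytes from {url}".encode()
    print(dispatch("cat.png", fetch=fake_fetch))
    print(dispatch("missing.png", fetch=fake_fetch))
```

In this setup adding another storage server only means adding rows to the lookup table, while public image URLs stay on one hostname. The trade-off is that every image request passes through the dispatching server, so a cache or CDN in front of it (the thread mentions Cloudflare) would matter.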