I'm making the next big search engine using PHP and MySQL. To get there, it has to:
1. Handle terabytes of data
2. Understand the many dialects, misuses and misspellings of HTML found in real web pages on the internet.
3. Make ranking algorithms that find the right pages for a search query.
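
As a rough illustration of points 2 and 3, here is a minimal sketch of tolerant HTML text extraction followed by a simple TF-IDF ranking. It is not the code behind the site: it is written in Python rather than PHP for brevity, the names `TextExtractor`, `html_to_tokens` and `rank` are made up for this example, the sample pages are invented, and TF-IDF is only a stand-in for whatever ranking the engine really uses.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores <script>/<style> contents.

    html.parser does not require well-formed markup, so unclosed or
    upper-case tags are handled without raising errors.
    """

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def html_to_tokens(html):
    """Extract visible text from (possibly broken) HTML and tokenize it."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered at end of input
    return tokenize(" ".join(parser.parts))


def rank(query, pages):
    """Return URLs with a non-zero TF-IDF score, best match first."""
    docs = {url: Counter(html_to_tokens(html)) for url, html in pages.items()}
    n = len(docs)
    scores = {}
    for url, counts in docs.items():
        total = sum(counts.values()) or 1
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for c in docs.values() if term in c)
            if df == 0:
                continue
            idf = math.log((1 + n) / (1 + df)) + 1  # smoothed IDF
            score += (counts[term] / total) * idf
        scores[url] = score
    hits = [url for url in scores if scores[url] > 0]
    return sorted(hits, key=lambda url: scores[url], reverse=True)


# Invented sample pages with deliberately sloppy markup.
pages = {
    "a.example": "<html><body><h1>Cheap Flights<p>Find cheap flights to Oslo<b>today</body>",
    "b.example": "<TITLE>Gardening tips</title><P>Grow tomatoes <script>var flights = 1;</script>at home",
    "c.example": "<p>Oslo hotels &amp; flights<br>",
}

if __name__ == "__main__":
    print(rank("cheap flights", pages))
```

With these sample pages, a query for "cheap flights" puts `a.example` ahead of `c.example`, and `b.example` is left out because its only mention of "flights" sits inside a script. A real index of this size would precompute the document frequencies instead of rescanning every page per query.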
I'm currently in the process of moving the database from my shared hosting account with 2 GB of storage to a VPS with 16 GB of storage. If everything goes well, this will increase the index size from 50,000 pages to 250,000 pages.
You can find it here: SecretSearchEngineLabs.com
