Create a Download Site Using XPATH and PHP
- List Elements in HTML Document by Tag Name Using XPATH and PHP
- List All Elements in HTML Document by Tag Name Using XPATH and PHP
- List Specified Elements in XML Document by Tag Name Using XPATH and PHP
- List Urls in XML Sitemap by Tag Name Using XPATH and PHP
- List Elements in XML Document by Tag Name Using XPATH Query and PHP
- List Child Nodes of Element in XML Document Using XPATH Query and PHP
- List Urls in XML Sitemap Using XPATH Query and registerNamespace and PHP
- List a Website's Audio Links Alphabetically Using XPATH and PHP
- List a Website's Video Links Alphabetically Using XPATH and PHP
- Grab Web Page Links and Video Links and Audio Links from Web Page
- List a Website's Images Alphabetically Using XPATH and PHP
- List a Website's External Links Alphabetically Using XPATH and PHP
- List a Website's Page Urls Alphabetically Using XPATH and PHP
- List a Website's Page Descriptions Alphabetically Using XPATH and PHP
- List a Website's Page Titles Alphabetically Using XPATH and PHP
- Get Links from Web Page Using XPATH and PHP
- Count and Alphabetize Words on a Web Page
- Search Website without Indexing Using XPATH and PHP
- Free Website Indexing Script Using XPATH and PHP
- Free Website Search Script Using PHP
- Free Website Search Script and Tutorial
This page will help you create a download site. Our sample file is a PAD file, which we generate with PADGen. A PAD file is an easy way to tell download sites what software you have, where it is, what it costs, what its screenshot is, where its download file (zip or exe) is, and where to find more product info. http://www.theliquidateher.com/pad_file.xml is a real PAD file; look at it to see the tags and info such files contain.
To have a download site, you must first learn how to read PAD files. The script below shows how. But we should point out the critical tags to read. Once you read the parts of submitted PAD files you deem important, you need to store the info in a MySQL database. Here are the tags we consider critical:
- <Program_Version> (a number)
- <Program_Type> (demo, freeware, shareware, etc.)
- <Char_Desc_80> (short description)
- <Char_Desc_2000> (long description)
- <Application_XML_File_URL> (PAD file url)
- <Primary_Download_URL> (.zip or .exe)
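Reading the tags listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample XML is a made-up, heavily trimmed PAD-style file, and readPadFields() is a hypothetical helper name, not part of the script discussed on this page. It returns null for a missing tag instead of failing, which may matter with submitted files you do not control.

```php
<?php
// Hypothetical, heavily trimmed PAD-style sample; real PAD files carry many more tags.
$samplePad = <<<XML
<XML_DIZ_INFO>
  <Program_Info>
    <Program_Version>1.2</Program_Version>
    <Program_Type>Shareware</Program_Type>
  </Program_Info>
  <Program_Descriptions>
    <English>
      <Char_Desc_80>A short description.</Char_Desc_80>
      <Char_Desc_2000>A much longer description.</Char_Desc_2000>
    </English>
  </Program_Descriptions>
  <Web_Info>
    <Application_XML_File_URL>http://example.com/pad_file.xml</Application_XML_File_URL>
    <Primary_Download_URL>http://example.com/app.zip</Primary_Download_URL>
  </Web_Info>
</XML_DIZ_INFO>
XML;

// Illustrative helper (an assumption, not from the original script):
// returns tag => value, with null for any tag the file does not contain.
function readPadFields($xml, array $tags)
{
    $doc = new DOMDocument;
    if ($xml === '' || !@$doc->loadXML($xml)) {
        return array(); // empty or not well-formed XML
    }
    $fields = array();
    foreach ($tags as $tag) {
        // item(0) is null when no element with this tag name exists.
        $node = $doc->getElementsByTagName($tag)->item(0);
        $fields[$tag] = ($node === null) ? null : trim($node->nodeValue);
    }
    return $fields;
}

$criticalTags = array(
    'Program_Version',
    'Program_Type',
    'Char_Desc_80',
    'Char_Desc_2000',
    'Application_XML_File_URL',
    'Primary_Download_URL',
);

$fields = readPadFields($samplePad, $criticalTags);
print_r($fields);
```

For a submitted file you would pass the file's contents instead of $samplePad, for example the string returned by file_get_contents().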
Using the PAD reading script ideas below, you can read submitted PAD files and put them into your MySQL database and then, by reading your db, display the submitted program's info in the proper category of your download website. You'll want to be careful about security, running submitted PAD files through a virus/spyware checker. But in case you feel lost, let's start at the beginning.
What's a download site? A site like download.com where you can go to find all kinds of freeware and shareware to download and use, like Ez-Architect, for example. A more complete site (like download.com) should have user reviews, ads, and editors that look over PAD submissions and check to see they are legit and not porn related—or anything else unacceptable. There should be lots of categories and subcategories so it's easy for people to find things. A few sites even host the download files (.zip or .exe) themselves, but this is rare. Most sites merely store links to everything in their database—the ones submitted in the PAD file—and display these as live links on their site. Don't fret if you cannot create the next download.com. Just be happy if you can create a functional download site.
What are most download sites like? The good news is that there are plenty of them and some of them are very professional, responsive, fair, and well programmed. The average site seems to store 30,000 to 90,000 software entries. Most sites accept PAD file submissions. The ones that don't aren't worth wasting time on, if you are a software owner who submits PAD files. We've submitted to over 300 different sites. Here is the bad news:
Over half will be dead within a couple of years. Most fail. Many let their sites fill with data until the host blocks new submissions. An AMAZING number of download sites have broken submitters. Some sites show PHP errors or MySQL errors or Not Authorized errors when you try to submit. Few sites have decent searches. Worse, even when a product is on the site, many search functions fail to find it. Over half of the sites do not even have search functions. Incredibly enough, some sites do not even have categories! How could anyone possibly use a download site with no search function or category navigation? When you ask yourself why these people bothered to make a site, the answer is obvious. They are spammers who merely want your submitted email address so they can spam it or sell it to spammer companies.
And it gets even worse: Some have the nerve to try to make you PAY for the questionable privilege of submitting to them. Some say you MUST put a link to their site on your site or they will dump your submission in the sewer. Some say you must join their affiliate program and give them a piece of every sale. But the very worst ones lead you to enter a whole lot of info manually, and then, when you hit Submit, they ask for a credit card and some money, hoping you will pay because otherwise all the time you just invested in them is wasted. A few say they're no longer accepting submissions. A few dump all submissions and are really only trying to sell you one software program: theirs! And a few are not download sites at all; they are simply pages full of ads.
Google, are you listening? Most of the download sites that come up in searches are inferior, spammers, bait-and-switchers, ads, phonies, broken, badly programmed, or not even there. Many of the ones that come up in download site lists are no better.
But happily, there are plenty of download sites worthy of the name, and some of them are very professional, responsive, fair, and well programmed. So how do you make your own site? The same way an elephant makes love to a mouse—VERY carefully! We mentioned above the minimum requirements. If you're planning to make a spam site, please just don't. We don't appreciate such things one little bit. We software users and sellers have learned, by this time, that there are no pots of gold at the end of the Nigerian Princess rainbow—nor any princesses, for that matter. And the guy in a foreign country that wishes to help us because our bank account has a critical problem and he needs us to hurry and send him the account info so he can correct the problem before the account is frozen or worse—the only help he wants to give is helping himself to our money. In other words, either make a decent site or please refrain from littering the Net with more crappy, deceptive, useless spam traps. But, on the other hand, if you are legitimately interested in making a download site, read on:
You need to allow users to click on categories, at least, but having subcategories within the categories is better. We have a sample CMS for a website directory that has categories, and the code is available at website directory code. Other administrative functions are needed too, such as adding, deleting, or viewing categories or entries, as you can see at Website Directory Content Management System.
The script below uses PHP 5 and the PHP DOM extension, which is enabled by default in most PHP installations, so the following should work fine (it does for us). The DOM extension lets you operate on XML documents through the DOM API. It supports XPath 1.0, which this script uses extensively. XPath has been around a while. What is it? XPath is a syntax for addressing parts of an XML document (or an HTML or XHTML one). It uses path expressions to navigate in documents, and it comes with a library of standard functions.
The DOMXPath class has a document property (the DOMDocument it was built from) and several very useful methods:
- DOMXPath::__construct creates the XPath object for a given DOMDocument.
- DOMXPath::evaluate evaluates the given XPath expression and returns a typed result if possible, or a DOMNodeList containing all nodes matching the expression.
- DOMXPath::query evaluates the given XPath expression and returns a DOMNodeList containing all nodes matching the expression.
- DOMXPath::registerNamespace registers a prefix for a namespace URI. This is necessary when a document declares a default namespace with xmlns, which in a sitemap is on the urlset tag.
- DOMXPath::registerPhpFunctions lets XPath expressions call PHP functions.
Many XML files, PAD files among them, have no xmlns declaration and therefore need no namespace registration.
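To make the namespace point concrete, here is a minimal sketch using a made-up two-entry sitemap. The prefix sm is an arbitrary choice for this example; only the namespace URI has to match the xmlns value on the urlset tag. The sketch also shows evaluate() returning a typed result rather than a node list.

```php
<?php
// Hypothetical two-entry sitemap; the default namespace is declared on <urlset>.
$sitemap = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/about.html</loc></url>
</urlset>
XML;

$dom = new DOMDocument;
$dom->loadXML($sitemap);
$xpath = new DOMXPath($dom);

// In XPath 1.0 an unprefixed name only matches elements in no namespace,
// so this query finds nothing in a namespaced sitemap.
$unprefixed = $xpath->query('//loc');

// 'sm' is an arbitrary prefix chosen for this example.
$xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

$locs = array();
foreach ($xpath->query('//sm:url/sm:loc') as $node) {
    $locs[] = $node->nodeValue;
}

// evaluate() returns a typed result where possible: count() gives a number.
$howMany = $xpath->evaluate('count(//sm:loc)');

echo 'Unprefixed matches: ' . $unprefixed->length . "\n";
echo implode("\n", $locs) . "\n";
echo 'Counted by evaluate(): ' . $howMany . "\n";
```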
Below, we perform a few tasks, first with DOM only and no XPath. Then we do the same thing using XPath. You only need one of these methods to read PAD files. Choose one. XPATH is more fun, but DOM-only methods can be more straightforward for some tasks. The getElementsByTagName() method seems more straightforward for this task of listing Specified Elements in XML Document by Tag Name. But keep in mind that XPath can do a lot that DOMDocument objects alone could never do. First the non-XPath version:
The new DOMDocument object is created so we can use the Document Object Model to get info from the file. We load the XML file with the load method. Then we use the getElementsByTagName() method and ->item(0)->nodeValue to get various tag contents as strings. Since echo only outputs strings, a raw DOM object cannot be echoed until you take its value as a string; once we have the strings, we echo them.
Now the XPath version: A new DOMDocument object is created because XPath needs a DOMDocument object to work on.
The $dom->load('http://www.theliquidateher.com/pad_file.xml') call reads a PAD file's contents into the $dom object. Next we use $xpath = new DOMXPath($dom) to create a DOMXPath object bound to that document. Next we define the $info array. It is not required, but it's a convenient place to store XML document info if you need to. Now we perform an XPath query for all elements that have File_Info as the parent tag (which is why there is /* at the end of the XPath expression passed to query). The result is a DOMNodeList that we loop through, putting the node values into the $info array as well as echoing each one. Then we get a single element, using the // XPath syntax to select matching nodes no matter where they are in the document. We take the node value of the query's result and echo it to the screen.
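As a variation on the steps just described, the sketch below keys the $info array by tag name rather than by position, which can be handier when the values are later stored in a database. It is an illustration only: the inline XML is a made-up fragment standing in for the real PAD file, and the File_Size_K lookup is just one example of a single-element query.

```php
<?php
// Hypothetical fragment standing in for a real PAD file.
$pad = <<<XML
<XML_DIZ_INFO>
  <Program_Info>
    <File_Info>
      <File_Size_Bytes>2097152</File_Size_Bytes>
      <File_Size_K>2048</File_Size_K>
      <File_Size_MB>2.0</File_Size_MB>
    </File_Info>
  </Program_Info>
</XML_DIZ_INFO>
XML;

$dom = new DOMDocument;
$dom->loadXML($pad);
$xpath = new DOMXPath($dom);

// '/*' selects only the element children of File_Info,
// so whitespace text nodes between the tags are skipped.
$info = array();
foreach ($xpath->query('//File_Info/*') as $node) {
    $info[$node->nodeName] = $node->nodeValue;
    echo $node->nodeName . ': ' . $node->nodeValue . "<br>\n";
}

// '//' finds the element wherever it sits; item(0) is null if it is absent.
$single = $xpath->query('//File_Size_K')->item(0);
$sizeK = ($single === null) ? null : $single->nodeValue;
echo 'Size in K: ' . $sizeK . "<br>\n";
```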
As you will see in List Urls in XML Sitemap by Tag Name Using XPATH and PHP, you can also loop through results you get when using the getElementsByTagName() method, since this method returns a new instance of class DOMNodeList containing the elements with a given tag name. These are easy to loop through.
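Looping through a getElementsByTagName() result might look like the sketch below. The three-entry sitemap is made up for illustration. Note that this DOM-only approach matches on the tag name, so no namespace registration is involved here.

```php
<?php
// Hypothetical sitemap with its entries deliberately out of order.
$sitemap = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/b.html</loc></url>
  <url><loc>http://example.com/a.html</loc></url>
  <url><loc>http://example.com/c.html</loc></url>
</urlset>
XML;

$doc = new DOMDocument;
$doc->loadXML($sitemap);

// getElementsByTagName() returns a DOMNodeList, which foreach can walk directly.
$urls = array();
foreach ($doc->getElementsByTagName('loc') as $loc) {
    $urls[] = trim($loc->nodeValue);
}

sort($urls); // alphabetical order

foreach ($urls as $url) {
    echo $url . "<br>\n";
}
```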
The DOM-only version has no need for $xpath = new DOMXPath($doc), because getElementsByTagName() is a DOMDocument method and does not use an XPath object. But for $xpath->query() calls, the DOMXPath object is essential.
If an XPath expression or a DOM method such as getElementsByTagName() returns a node set, you get a DOMNodeList that can be looped through to get values. In the non-XPath version below, we skip the loop and just get the node values of four different tags found in the file. (We could have done what our XPath example does and gone after all elements with File_Info as the parent tag, but since only three tags have that parent, reading them one by one was fine.) Getting node values one tag at a time works well when no tags share a tag name and few child tags sit under any one parent. But, as in List Urls in XML Sitemap by Tag Name Using XPATH and PHP, it is often wise to loop through elements with a certain tag name, especially when many elements share that name.
In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes. You can get more information on the syntax to use in XPath expressions on the W3Schools XPath expression page.
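A few of these node kinds can be selected as in the following sketch. The small XML string is invented for the example; the expressions themselves are standard XPath 1.0.

```php
<?php
// Hypothetical well-formed snippet containing a comment, elements,
// attributes, and text.
$xml = '<page><!-- sample comment -->'
     . '<a href="song.mp3">Song</a><a href="clip.mp4">Clip</a></page>';

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$elements = $xpath->query('//a');          // element nodes
$attrs    = $xpath->query('//a/@href');    // attribute nodes
$texts    = $xpath->query('//a/text()');   // text nodes
$comments = $xpath->query('//comment()');  // comment nodes

echo 'Elements: ' . $elements->length . "\n";
foreach ($attrs as $attr) {
    echo 'href: ' . $attr->nodeValue . "\n";
}
foreach ($texts as $text) {
    echo 'text: ' . $text->nodeValue . "\n";
}
echo 'comment: ' . trim($comments->item(0)->nodeValue) . "\n";
```

Each result is a DOMNodeList whatever kind of node it holds, so the same loop works for all of them.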
The ability to read PAD files so you can stick their info into a MySQL database, along with the ability to read your db and display its info on the screen, is a critical skill for download site building. This site has the info you need for the various tasks involved in meeting the download site challenge. Above all, have fun!
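// DOM-only version: read four tags by name and echo them.
$doc = new DOMDocument;
$doc->load('http://www.theliquidateher.com/pad_file.xml');
$fileinfo1 = $doc->getElementsByTagName('File_Size_Bytes')->item(0)->nodeValue;
$fileinfo2 = $doc->getElementsByTagName('File_Size_K')->item(0)->nodeValue;
$fileinfo3 = $doc->getElementsByTagName('File_Size_MB')->item(0)->nodeValue;
$description = $doc->getElementsByTagName('Char_Desc_2000')->item(0)->nodeValue;
echo $fileinfo1 . '<br>' . $fileinfo2 . '<br>' . $fileinfo3 . '<br>' . $description . '<br>';

// XPath version: read every child of File_Info, then a single element.
$dom = new DOMDocument;
$dom->load('http://www.theliquidateher.com/pad_file.xml');
$xpath = new DOMXPath($dom);
$info = array();
$infoNodes = $xpath->query('//File_Info/*');
for ($i = 0; $i < $infoNodes->length; $i++) {
    $info[$i] = $infoNodes->item($i)->nodeValue;
    echo $info[$i] . '<br>';
}
// Single element found anywhere via //; Char_Desc_2000 is assumed here to mirror the DOM-only version.
echo $xpath->query('//Char_Desc_2000')->item(0)->nodeValue;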
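Once tag values have been read, storing them in MySQL could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the table name programs, the column names, the helper names buildPadInsert() and savePad(), and the connection details in the comment. The sketch uses PDO with a prepared statement, so submitted values are bound as parameters rather than pasted into the SQL text.

```php
<?php
// Hypothetical mapping from PAD tag names to column names in an
// assumed table called "programs"; adjust to your own schema.
function buildPadInsert(array $fields)
{
    $columns = array(
        'Program_Version'          => 'version',
        'Program_Type'             => 'type',
        'Char_Desc_80'             => 'short_desc',
        'Char_Desc_2000'           => 'long_desc',
        'Application_XML_File_URL' => 'pad_url',
        'Primary_Download_URL'     => 'download_url',
    );

    $params = array();
    foreach ($columns as $tag => $column) {
        // Missing tags are stored as NULL.
        $params[':' . $column] = isset($fields[$tag]) ? $fields[$tag] : null;
    }

    $sql = 'INSERT INTO programs (' . implode(', ', $columns) . ') VALUES ('
         . implode(', ', array_keys($params)) . ')';

    return array($sql, $params);
}

// Runs the insert on an existing PDO connection.
function savePad(PDO $pdo, array $fields)
{
    list($sql, $params) = buildPadInsert($fields);
    $stmt = $pdo->prepare($sql); // values are bound, never spliced into the SQL
    return $stmt->execute($params);
}

// Example values, as they might come from a submitted PAD file.
$fields = array('Program_Version' => '1.2', 'Program_Type' => 'Shareware');
list($sql, $params) = buildPadInsert($fields);
echo $sql . "\n";

// With a real database (connection details are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=downloads;charset=utf8', 'user', 'password');
// savePad($pdo, $fields);
```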