A question on automating database population for you server-side experts.
Let’s say you have a web site that you’ve been updating manually for a few years. Let’s also say that you’re sick to death of doing it this way, have finally taken the steps necessary to automate this thankless task, and now it’s finally time to throw all that manually-input data into a database. For the sake of argument, let’s also assume that adding the 700+ items by hand just isn’t going to happen.
So then my question to you is, can you see any way of taking multiple pages of static, well-formed (and consistent) HTML like, say, this, and getting it to automatically post to a form that looks, well, like this?
I’m sure to some out there, it’s the easiest thing in the world. To me, however, it’s not. So, what can be done? Ideally, a single PHP file would crawl the various category pages (and the sub-pages they link to) under the ‘All Categories’ heading on the ‘All Designs’ page and either a) post the data to the form (assuming the example values aren’t there in the final form), or, preferably, b) store it in logically-named PHP arrays (following the naming convention in the example function below) so that I can bypass the form completely.
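To make option b) concrete, here is a minimal sketch of the parsing half, using PHP's built-in DOMDocument and DOMXPath. The list markup it queries is an assumption, since the real pages may be structured differently. So is the idea that the official number can be read from the CSS file name. The function name `parse_category_page` is hypothetical; the array keys borrow the parameter names of `save_submission`.

```php
<?php
// Sketch only: the markup queried below is an ASSUMPTION, not the site's real HTML.
// Assumed shape of one entry:
//   <li><a href="/001/001.css">Title</a> by <a href="http://...">Name</a></li>
// The author link is optional ("URL (when it exists)").

function parse_category_page(string $html, string $category): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);               // collect parser warnings instead of printing them
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html); // hint that the input is UTF-8
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $submissions = [];

    // Every <li> that directly contains at least one link is treated as a submission.
    foreach ($xpath->query('//li[a]') as $item) {
        $links  = $xpath->query('a', $item);
        $design = $links->item(0);                  // first link: the design / CSS file
        $author = $links->item(1);                  // second link: the author, may be null

        if ($author !== null) {
            $name = trim($author->textContent);
            $url  = $author->getAttribute('href');
        } else {
            // No author link: fall back to the text after "by".
            $found = preg_match('/\bby\s+(.+)$/u', trim($item->textContent), $m);
            $name  = $found ? trim($m[1]) : '';
            $url   = '';
        }

        $cssFile = $design->getAttribute('href');
        // ASSUMPTION: the official number is the digits in the CSS file name.
        $number = preg_match('/(\d+)\.css$/', $cssFile, $m) ? $m[1] : '';

        $submissions[] = [
            'subName'           => $name,
            'subUrl'            => $url,
            'subTitle'          => trim($design->textContent),
            'subCssfile'        => $cssFile,
            'subOfficialNumber' => $number,
            'subCategories'     => [$category],     // one category per static page
        ];
    }

    return $submissions;                            // same order as on the page
}
```

Calling this once per category page, with the HTML fetched via something like `file_get_contents`, would give one array per category, in the same order the submissions appear on the page.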
Here are a few more supporting pieces of information that might be relevant to this task:
- Not all fields in the new form are represented in the static data. For example, each submission currently receives only one category, whereas the form allows multiple categories. So the script should detect which category the static data is coming from, assume a corresponding entry exists in the list of categories (even though it doesn’t at the moment), and assign accordingly. Multiple categories can always be re-assigned later.
- Fields which are considered absolutely essential/cannot be discarded/must have a value are: Name, URL (when it exists), CSS File, Submission Title, Category, Official Number (where it exists). Fields which exist in the database for other use, but don’t appear anywhere in the static data (and therefore can be ignored) are: E-Mail Address, Zip File, Windows Browsers, Mac Browsers, Comments. See note about submission/publication dates below.
- The order in which the submissions are displayed in the static files is important, in that the further along in the list they are, the earlier they were submitted, and therefore the earlier they should be posted. I’ll probably have to do some manual jiggering of the submission/publishing dates because I’ve pretty much discarded that data all along, so as much as possible, the script should try to preserve this implicit chronology. The more relevant of the two is publishing date, as it’s what will ultimately determine listing order on the site.
- The form flattens out the data quite a bit; in reality, this data spans multiple tables in the database. This shouldn’t matter, but in case you think better this way, it might be relevant to know that the form itself basically posts to a PHP function:
save_submission($subName, $subEmail, $subUrl, $subNation, $subTitle, $subCssfile, $subZipfile, $subWindows, $subMac, $subNotes, $subStatus, $subDate, $pubDate, $subOfficialNumber, "populate");
But a separate category handler function also needs to be called:
…where $subId is the submission id just created, and $subCategories is an array of the values selected.
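Putting the notes above together, the population step might look something like the sketch below. It rests on several assumptions. It takes `save_submission` to return the id of the row it creates. It takes the category handler as a callable, `$assignCategories`, because its real name isn't shown here. It guesses at the date format. The function `populate_from_static` and its parameters are hypothetical. Dates are synthetic: each submission gets a slightly later timestamp than the one before it, so the implicit chronology survives until the real dates are adjusted by hand.

```php
<?php
// Sketch only. ASSUMPTIONS: $save (e.g. 'save_submission') returns the new
// submission id; $assignCategories stands in for the category handler;
// the date format 'Y-m-d H:i:s' is a guess at what the database expects.

function populate_from_static(
    array $rows,                 // submissions in page order (most recently submitted first)
    callable $save,
    callable $assignCategories,
    int $oldestTimestamp,        // Unix time to give the earliest submission
    int $stepSeconds = 60        // gap between consecutive synthetic dates
): array {
    $ids = [];

    // Further down the page means earlier, so reverse to insert oldest first.
    foreach (array_reverse($rows) as $i => $row) {
        $date = gmdate('Y-m-d H:i:s', $oldestTimestamp + $i * $stepSeconds);

        $subId = $save(
            $row['subName'],
            '',                              // $subEmail: not in the static data
            $row['subUrl'] ?? '',
            $row['subNation'] ?? '',
            $row['subTitle'],
            $row['subCssfile'],
            '',                              // $subZipfile: not in the static data
            '',                              // $subWindows: not in the static data
            '',                              // $subMac: not in the static data
            '',                              // $subNotes: not in the static data
            $row['subStatus'] ?? '',
            $date,                           // $subDate (synthetic)
            $date,                           // $pubDate (synthetic, drives listing order)
            $row['subOfficialNumber'] ?? '',
            'populate'
        );

        // One category per submission for now; more can be assigned later.
        $assignCategories($subId, $row['subCategories'] ?? []);
        $ids[] = $subId;
    }

    return $ids;                             // ids in insertion (oldest-first) order
}
```

Passing the two handlers in as callables also makes it possible to do a dry run with stand-in functions that only log their arguments before anything touches the database.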