Download!

Something came up at work: one of our larger partners would like our schedule database in some sort of XML so they can put our schedule on their site.  Great!  Piece of cake to do, plus more sales for us.  Well, now they want the schedule data from all the centers, worldwide I assume.  The catch is that corporate HQ holds that data and there is no direct way to get at it.  So what I have is a little command prompt app that simply hits their public website and strips the classes and dates for our schedule off the pages.  Best I can do.  Requests for an XML Web Service or even a CSV file have gone unanswered, so this is what I was left with.
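For the curious, the scraping idea looks roughly like the Python sketch below.  This is not the actual app, just an illustration: the HTML layout, the `course` and `date` cell classes, and the sample rows are all made up, since the real page markup isn't shown here.  It parses an inline sample page instead of hitting a live site.

```python
from html.parser import HTMLParser


class ScheduleParser(HTMLParser):
    """Collects (course, date) pairs from table rows.

    Assumes hypothetical markup: <td class="course"> and <td class="date">.
    """

    def __init__(self):
        super().__init__()
        self.rows = []        # finished (course, date) tuples
        self._field = None    # which cell we are currently inside, if any
        self._current = {}    # cells collected for the row in progress

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            css = dict(attrs).get("class")
            if css in ("course", "date"):
                self._field = css
                self._current[css] = ""

    def handle_data(self, data):
        # Only keep text that sits inside a course or date cell;
        # everything else on the page is filler and gets ignored.
        if self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr":
            if "course" in self._current and "date" in self._current:
                self.rows.append((self._current["course"].strip(),
                                  self._current["date"].strip()))
            self._current = {}


def parse_schedule(html):
    """Return a list of (course, date) tuples found in the page."""
    parser = ScheduleParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page standing in for one center's schedule page.
SAMPLE_PAGE = """
<html><body><p>Lots of filler text...</p>
<table>
<tr><td class="course">Intro to SQL</td><td class="date">2005-03-14</td></tr>
<tr><td class="course">XML Basics</td><td class="date">2005-03-21</td></tr>
</table></body></html>
"""

if __name__ == "__main__":
    for course, date in parse_schedule(SAMPLE_PAGE):
        print(course, date)
```

In a real run, the sample string would be replaced by each center's downloaded page, which is where all the bandwidth goes.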

Now I need to multiply that by 300 centers worldwide.  Originally we were looking at 5 MB for just our site (there is a lot of "filler" text per page); now we are looking at ~1.5 GB downloaded.  Woohoo!  I wonder, if I keep sucking up their bandwidth, whether they will relent and create a web service.  It's not like this data isn't already public or anything.

Also, adding centers worldwide kinda presented a problem for our database too.  I basically had to create a new table to hold the course dates with one extra field, the Location.  I dropped the old table and created a view that returns the "old" table with just our records, which keeps our web site running flawlessly.  Once I get the data downloaded I'll see what kind of boost an indexed view on that table will produce.
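The table-plus-view trick can be sketched like this.  I'm using Python's built-in `sqlite3` purely as a stand-in so the example runs anywhere; the table, view, and column names, the `'OurCenter'` value, and the sample rows are all hypothetical.  SQLite also has no indexed views, so this only shows the compatibility view, not the indexing part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- New table: the old columns plus Location (names are hypothetical).
    CREATE TABLE AllCourseDates (
        Course    TEXT NOT NULL,
        StartDate TEXT NOT NULL,
        Location  TEXT NOT NULL
    );

    -- The view takes over the old table's name and shape,
    -- so existing queries keep working and only see our center.
    CREATE VIEW CourseDates AS
        SELECT Course, StartDate
        FROM AllCourseDates
        WHERE Location = 'OurCenter';
""")

# Made-up rows: two for our center, one for somewhere else.
conn.executemany(
    "INSERT INTO AllCourseDates VALUES (?, ?, ?)",
    [
        ("Intro to SQL", "2005-03-14", "OurCenter"),
        ("XML Basics", "2005-03-21", "OtherCenter"),
        ("Web Services 101", "2005-04-04", "OurCenter"),
    ],
)

# The "old" query, unchanged, now runs against the view.
rows = conn.execute(
    "SELECT Course, StartDate FROM CourseDates ORDER BY StartDate"
).fetchall()
print(rows)
```

The point of the design is that nothing on the web site has to change: it still selects from the old name and never sees the other centers' rows.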

So, I guess the question is whether I should download all the dates nightly or weekly.  If this thing completes in under 3 hours, I'm going nightly until they fold.  Of course, knowing corporate, they won't even notice 1.5 GB of traffic coming from the same IP every night.
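A quick back-of-the-envelope check on that 3 hour window, as a Python sketch.  It assumes every center's pages weigh about the same 5 MB as ours (which is where the ~1.5 GB estimate comes from) and uses decimal units; the real numbers will vary per center.

```python
PER_CENTER_MB = 5    # what our own site costs to download
CENTERS = 300        # centers worldwide
WINDOW_HOURS = 3     # the "go nightly" cutoff

total_mb = PER_CENTER_MB * CENTERS
total_gb = total_mb / 1000

# Average rate needed to pull everything inside the window.
required_mb_per_s = total_mb / (WINDOW_HOURS * 3600)
required_mbit_per_s = required_mb_per_s * 8

print(f"Total download: {total_mb} MB (~{total_gb:.1f} GB)")
print(f"Sustained rate to finish in {WINDOW_HOURS} h: "
      f"{required_mb_per_s:.2f} MB/s (~{required_mbit_per_s:.1f} Mbit/s)")
```

That works out to about 0.14 MB/s, or roughly 1.1 Mbit/s sustained, so whether nightly is realistic probably depends more on how fast their site serves pages than on the raw download size.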
