PHP, XML & YouTube - Namespaces Within Namespaces
A familiar sight to many people, programmers and commoners alike...
As the title suggests, I have been updating my PHP XML feed from my YouTube Channel to R3DMM. Pretty standard stuff. I originally wrote the brief sections of code required to parse my channel back in 2009. At the time, I only wanted a few bits of data; enough to link to a video and get the thumbnail. With the new R3DMM layout I'm working on (seriously, I've dropped the beta link enough times now) I wanted to take it deeper, bring in view count, likes, duration, stuff like that.
Let me preface this with a short explanation of XML, for those people who don't know what it is.
In a nutshell, XML is a simple way for data to be communicated in a plain, easy to interpret format. You get tags such as <name> which, as logic would suggest, contain a name (be it of a person, a fruit, or whatever). You can then take that tag and... use it.
So, what's the big hubbub about? Why the rageface at the head of the post?
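For the curious, here is roughly what "use it" looks like in PHP. This is only a minimal sketch: the <person> wrapper element and the variable names are made up for illustration.

```php
<?php
// Hypothetical one-record document; <person> is just a wrapper for the example.
$xml = '<person><name>Bob Holness</name></person>';

// simplexml_load_string() turns the text into a SimpleXMLElement.
$person = simplexml_load_string($xml);

// Child tags are read like object properties; cast to string to get the text.
$name = (string) $person->name;
echo $name, "\n";
```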
Namespaces. Let's assume you have an XML feed which contains two pieces of data:
<name>Bob Holness</name>
<host>Blockbusters</host>
<name>Teal'c</name>
<host>Jaffa</host>
Now, I wouldn't be the first to suggest it, but there is no connection between Bob Holness and Teal'c. One is a gameshow host and the other is an alien (I'll leave that to your discretion). When you have two pieces of data in the same XML feed which have no connection, but share the same tag names, you need to separate them - with namespaces.
<gs:name>Bob Holness</gs:name>
<gs:host>Blockbusters</gs:host>
<sg:name>Teal'c</sg:name>
<sg:host>Jaffa</sg:host>
There. Bob now has the namespace 'gs' (gameshow) and Teal'c the namespace 'sg' (Stargate). The two can now be told apart with ease as they have different namespaces. So, with this clear system in place, why do you think anyone would do this:
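If you want to try that out in PHP, a rough sketch follows. One thing the snippet above glosses over is that a real parser wants each prefix declared with an xmlns attribute, so I have wrapped the data in a made-up <people> element and invented two namespace URIs (the example.com addresses are placeholders, not real schemas).

```php
<?php
// The <people> wrapper and both example.com URIs are hypothetical;
// any unique URI string would do for the declarations.
$xml = <<<XML
<people xmlns:gs="http://example.com/gameshow" xmlns:sg="http://example.com/stargate">
  <gs:name>Bob Holness</gs:name>
  <gs:host>Blockbusters</gs:host>
  <sg:name>Teal'c</sg:name>
  <sg:host>Jaffa</sg:host>
</people>
XML;

$people = simplexml_load_string($xml);

// Passing true as the second argument tells children() that the first
// argument is a prefix rather than a namespace URI.
$gs = $people->children('gs', true);
$sg = $people->children('sg', true);

$gsName = (string) $gs->name;
$sgName = (string) $sg->name;
echo $gsName, " / ", $sgName, "\n";
```

Each children() call only sees the tags from its own namespace, which is exactly the separation the prefixes are there to provide.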
<media:group>
<media:title type='plain'>Blockbusters Game - How To Play</media:title>
<yt:duration seconds='614'/>
</media:group>
<yt:recorded>2012-01-13</yt:recorded>
Who the diddly put a namespace within another, unrelated, namespace?! I'm looking at you, YouTube. Someone decided that it would be just dandy to put the 'yt' namespace within the 'media' namespace, despite the fact that 'yt' also appears as its own entity!
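// Assumes $entry is one <entry> from the feed (a SimpleXMLElement) and
// $namespaces is the prefix => URI map from getNamespaces(true).
// Load Media RSS
$media = $entry->children($namespaces['media']);
echo "title: ".$media->group->title."<br/>";
// yt:duration lives inside media:group, so switch namespace mid-chain
echo "dur: ".$media->group->children($namespaces['yt'])->duration->attributes()->seconds."<br/>";
// Load YT RSS
$yt = $entry->children($namespaces['yt']);
echo "Views: ".$yt->statistics->attributes()->viewCount."<br/>";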
Why the rage? Well, as you can see, it means that I have to put a children($namespace) definition within another definition. Rather than having my 'media' namespaces in one area and my 'yt' namespaces in their own, I have one, ugly, sticky-out line of code that uncomfortably rests between the two. Seriously. If you're going to use a schema to separate different types of data, don't mix them back up again!
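To see the whole thing run end to end without hitting YouTube, here is a self-contained sketch. The <entry> below is a cut-down stand-in for a feed entry, not a real response: the namespace URIs are the ones I believe the feed declares, and the viewCount value is invented for the example.

```php
<?php
// Cut-down stand-in for one feed entry. The xmlns URIs are assumed,
// and viewCount="1234" is sample data, not a real figure.
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:media="http://search.yahoo.com/mrss/"
       xmlns:yt="http://gdata.youtube.com/schemas/2007">
  <media:group>
    <media:title type="plain">Blockbusters Game - How To Play</media:title>
    <yt:duration seconds="614"/>
  </media:group>
  <yt:recorded>2012-01-13</yt:recorded>
  <yt:statistics viewCount="1234"/>
</entry>
XML;

$entry = simplexml_load_string($xml);

// Map of prefix => URI for every namespace used in the document.
$namespaces = $entry->getNamespaces(true);

// 'media' children of the entry.
$media = $entry->children($namespaces['media']);
$title = (string) $media->group->title;

// The sticky-out bit: 'yt' children nested inside media:group.
$ytInGroup = $media->group->children($namespaces['yt']);
$seconds = (int) $ytInGroup->duration->attributes()->seconds;

// 'yt' children sitting directly under the entry.
$yt = $entry->children($namespaces['yt']);
$recorded = (string) $yt->recorded;
$views = (int) $yt->statistics->attributes()->viewCount;

echo "title: $title\ndur: $seconds\nrecorded: $recorded\nviews: $views\n";
```

Pulling the nested children() call out into its own variable ($ytInGroup, my name for it) doesn't make the feed's structure any less annoying, but it does at least stop the line from sticking out quite so far.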
Furthermore, don't get me started on the extremely poor documentation that surrounds SimpleXMLElement. Just don't.