Convert zap2it TV listings from .html to XMLTV or XTVD .xml

zap2xml is a small, fast command-line script for Windows/Linux/OSX/* that connects to a Zap2it.com TV Listings account, downloads the TV listings grid data, and converts the .html to XMLTV-formatted or XTVD-formatted .xml.

(nb: the old labs.zap2it.com shut down, but tvlistings.zap2it.com continues...)

Setup:

  1. Get a free Zap2it.com TV Listings account.
  2. Set Zap2it.com preferences:
    1. Check "Show six hour grid" option.
    2. Check "Show only my favorites in the grid" option.
    3. Select all channels you want in your grid by adding them to the "My Favorites" column.
  3. Run zap2xml with the -u (account email) and -p (password) options for your account.
  4. Optionally set up a cron job to run it every day.

zap2xml options:

  • -u <username>
  • -p <password>
  • -d <# of days> (default = 7)
  • -n <# of no-cache days> (from end) (default = 0)
  • -N <# of no-cache days> (from start) (default = 0)
  • -s <start day offset> (default = 0)
  • -o <output xml filename> (default = "xmltv.xml")
  • -c <cacheDirectory> (default = "cache")
  • -l <lang> (default = "en")
  • -i <iconDirectory> (default = don't download icons)
  • -t <trailerDirectory> (default = don't download trailers)
  • -x = output XTVD xml file format (default = XMLTV)
  • -g = use GMT when retrieving data
  • -q = quiet (no status output)
  • -r <# of connection retries before failure> (default = 3, max 20)
  • -e = encode entities (html special characters like accents)
  • -E "amp apos quot lt gt" = selectively encode standard XML entities
  • -F = output channel names first (rather than "number name")
  • -O = use old tv_grab_na style channel ids (C###nnnn.zap2it.com)
  • -A "new live" = append " *" to program titles that are "new" and/or "live"
  • -U = UTF-8 encoding (default = "ISO-8859-1")
  • -L = output "<live />" tag (not part of xmltv.dtd)
  • -T = don't cache files containing programs with "To Be Announced" titles
  • -P <http://proxyhost:port> = use an HTTP proxy
  • -C <configuration file> (default = "~/.zap2xmlrc")
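
To illustrate what the -E list selects, here is a minimal Python sketch. It is not zap2xml's own Perl code: the function name encode_entities, the table XML_ENTITIES, and the replacement order are assumptions made for illustration. It encodes only the standard XML entities named in a space-separated list written the same way as the -E argument.

```python
# Hypothetical sketch of selective XML entity encoding, modeled on the -E option.
# Not taken from zap2xml itself; all names here are illustrative assumptions.

XML_ENTITIES = {
    "amp": ("&", "&amp;"),
    "lt": ("<", "&lt;"),
    "gt": (">", "&gt;"),
    "quot": ('"', "&quot;"),
    "apos": ("'", "&apos;"),
}


def encode_entities(text, selected="amp apos quot lt gt"):
    """Encode only the entities named in the space-separated 'selected' list."""
    names = selected.split()
    # Handle "amp" first so the "&" of freshly produced entities is not re-encoded.
    for name in sorted(names, key=lambda n: n != "amp"):
        if name in XML_ENTITIES:
            raw, escaped = XML_ENTITIES[name]
            text = text.replace(raw, escaped)
    return text
```

With the full list, "Tom & Jerry's" becomes "Tom &amp; Jerry&apos;s". With an empty list the text is returned unchanged.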

Example 1: "zap2xml -u zap2xmluser@email.xx -p password"

Example 2: "zap2xml -u zap2xmluser@email.xx -p password -d 10 -o myfile.xml -i icons"

To minimize web traffic, zap2xml keeps all downloaded .html files in a cache subdirectory of the directory from which it is run. It deletes old .html files in this directory based on last access time (a file is removed if its last access is more than days + 2 days ago).
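
The expiry rule can be sketched in Python as follows. This is an illustration, not zap2xml's actual code. It assumes "days" is the -d value and that age is measured from the file's access time in 24-hour units; the function name expired_cache_files is hypothetical.

```python
import os
import time


def expired_cache_files(cache_dir, days, now=None):
    """Return paths of cached files last accessed more than days + 2 days ago.

    Assumption: every regular file in cache_dir is a cache file; the real
    script may filter by name or extension.
    """
    now = time.time() if now is None else now
    cutoff_seconds = (days + 2) * 86400
    expired = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path) and now - os.stat(path).st_atime > cutoff_seconds:
            expired.append(path)
    return expired
```

With -d 7, a file last accessed 10 days ago would be listed for deletion, while one accessed yesterday would be kept.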

zap2xml is written in Perl (which is also packaged as Perl for Windows).

History:

  • 2007
    • Initial Release
    • Added -l and -e options
    • Added Compress::Zlib requirement (without Zlib you get 0 byte cache files)
    • GZip cached files to save hd space
    • Added -i <dir> option to download channel icons to a directory (they are copied to ###.gif as well)
    • Added -? usage instructions, updated dd_prog_id to (10.4) format from (8.4)
    • Removed brackets around movie_year, encode by default the 5 XML standard entities
    • Added -E option (-E "" disables encoding of the standard 5, -E "apos quot" only does those 2, etc)
    • Added decoding for utf8 characters, changed order of date field (passes dtd validation)
    • Added -U option
    • Check for program descriptions in p tags as well as span tags
    • Fix check for HD icon, set binmode when outputting UTF-8 file
    • Add -q and -s options, and output status messages to STDERR so "-o - > xmltv.xml" can work
    • Support an optional key=value text config file (~/.zap2xmlrc or specify via -C option)
    • Added -L option
    • Check for available "HD" and "High Definition" tags (video-hd, video-ahd)
    • Remove expired programs (old/already run programs with no time data from zap2it)
    • Added -A option to append an asterisk to new and/or live program names (-A "new" or -A "live" or -A "new live")
    • Use "USERPROFILE" as default home directory if "HOME" is not set (Windows)
    • Added -r option (retries= in config file) to specify # of zap2it connection retries before failure
    • Added -n option (ncdays= in config file) to specify # of days to ignore the cache. Example: If you use -d 5 to download 5 days of data with -n 2, it will not use the cache for the last 2 days. (-d 7 -n 7 will not use the cache at all [but will put more of a burden on the zap2it website]).
    • Added -P option (proxy= in config file) to specify an optional http proxy
    • Added -T option
    • Added -O option, changed default channel IDs to match xmltv's (station # rather than channel #, probably should have always been this way)
    • Added -N option (ncsdays= in config file) (same as -n, but count days from start rather than end)
    • Added -F option, strip trailing whitespace from options in config file (for Windows)
    • Added -t option (trailer= in config file) - Trailers are big. Only set this option if you really want them.
    • Strip "sponsored" from genre
    • Added -g option, added originalAirDate to <date> field for programs that are marked Live or New
    • Added -x option (outformat=xtvd in config file) - This outputs an XTVD formatted .xml file instead of XMLTV. (lineuptype/lineupname/lineuplocation are parsed from the source .html, or you can specify these in the config file)
    • Added previously-shown tag, update video tag to match XMLTV
    • Copy channel icons to "# NAME.gif" (as well as "#.gif")
    • Handle rare case where channel # is missing on zap2it web page
    • Channel links no longer contain postalcode/lineupIds, added those options to the user-definable config file (xtvd output format)
  • 2008-02-15
    • Updated parser for zap2it's new "new/live/hd" tags
  • 2008-03-11
    • Updated program description cookie to match zap2it's new one
  • 2008-05-25
    • Updated to handle zap2it's latest website changes
  • 2008-05-29
    • Handle channel numbers with "." in them
  • 2008-06-26
    • Updated to handle zap2it's latest website changes
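
The -n and -N entries in the history above describe which days bypass the cache. A minimal Python sketch of that selection follows. The function name uses_cache and the 0-based day index are assumptions, and how the -s start offset interacts with these options is not covered.

```python
def uses_cache(day_index, days=7, ncdays=0, ncsdays=0):
    """Return True if a day may be served from the cache.

    day_index is 0-based within the requested range of 'days' days.
    ncdays  (-n): number of days at the END of the range that skip the cache.
    ncsdays (-N): number of days at the START of the range that skip the cache.
    """
    skipped_from_end = day_index >= days - ncdays
    skipped_from_start = day_index < ncsdays
    return not (skipped_from_end or skipped_from_start)
```

For -d 5 -n 2, days 0 to 2 may use the cache and days 3 and 4 are always downloaded again. For -d 7 -n 7, no day uses the cache.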

Config file options (~/.zap2xmlrc):

  • start=0
  • days=7
  • ncdays=0
  • ncsdays=0
  • retries=3
  • user=user
  • pass=password
  • cache=cache
  • icon=icons
  • trailer=trailers
  • lang=en
  • proxy=http://localhost:8080
  • outfile=xmltv.xml
  • outformat=xmltv (or 'xtvd')
  • lineuptype=type (xtvd only - Cable/CableDigital/Satellite/LocalBroadcast)
  • lineupname=name (xtvd only)
  • lineuplocation=location (xtvd only)
  • lineupid=X:80000
  • postalcode=01010
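
For readers who want to generate or check a config file from another program, here is a minimal Python sketch of reading the key=value format. It is not zap2xml's own parser. The function names parse_config and default_config_path are hypothetical, comment handling is not modeled, and the "." fallback when neither HOME nor USERPROFILE is set is an assumption.

```python
import os


def parse_config(text):
    """Parse key=value lines into a dict, stripping trailing whitespace."""
    options = {}
    for line in text.splitlines():
        line = line.rstrip()  # drop trailing spaces or tabs left by some editors
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once so values may contain "="
        options[key.strip()] = value
    return options


def default_config_path(env=None):
    """Locate .zap2xmlrc under HOME, or under USERPROFILE if HOME is not set."""
    env = os.environ if env is None else env
    home = env.get("HOME") or env.get("USERPROFILE") or "."
    return os.path.join(home, ".zap2xmlrc")
```

All values are returned as strings, so a caller would convert numeric options such as days or retries itself.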

Comments:

  • If you have improvements please email them to: zap2xml (at) gmail.com