FeedWordPress
=============

* Author: [Charles Johnson](http://radgeek.com/contact)
* Version: 0.97
* Project URI:
* License: GPL 2. See License below for copyright jots and tittles.

Introduction
------------

FeedWordPress is an Atom/RSS aggregator for WordPress 1.5. It syndicates content from newsfeeds that you choose into your WordPress blog; if you syndicate several newsfeeds then you can use WordPress's posts database and templating engine as the back-end of an aggregation ("planet") website. I originally developed it because I needed a more flexible replacement for [Planet][] to use at [Feminist Blogs][].

[Planet]: http://www.planetplanet.org/ "Planet Planet"
[Feminist Blogs]: http://www.feministblogs.org/

FeedWordPress is designed with flexibility, ease of use, and ease of configuration in mind. You'll need a working installation of [WordPress 1.5][] and FTP or SFTP access to your web host. The ability to create cron jobs on your web host would be very helpful but it's not absolutely necessary. You *don't* need to tweak any plain-text configuration files and you *don't* need shell access to your web host to make it work. (Although, I should point out, web hosts that *don't* offer shell access are *bad web hosts*.)

[WordPress 1.5]: http://wordpress.org/development/2005/02/strayhorn/

Installation
------------

### Requirements ###

To use version 0.97 of FeedWordPress, you will need:

1.  an installed, configured copy of WordPress 1.5.x. (It *won't work* with WP 1.2 or WP 1.6 development builds.)

2.  FTP or SFTP access to your web host

And you'll probably also want to have either:

1.  the ability to create cron jobs on your web host, or at least

2.  a computer of your own and always-on Internet access

### Installation ###

#### Upgrades ####

To *upgrade* an existing installation of FeedWordPress to version 0.97:

1.  Download the FeedWordPress archive in zip or gzipped tar format and extract the files on your computer.
    Replace your existing FeedWordPress files with the new files. Be sure to upgrade `rss-functions.php` if you use the optional MagpieRSS upgrade, or if you don't use it yet but do want to syndicate Atom 1.0 feeds.

2.  **Immediately** log in to the WordPress Dashboard, and go to Options --> Syndication. Follow the directions to launch the database upgrade procedure. The new version of FeedWordPress incorporates some long-needed improvements, but old meta-data needs to be updated to prevent duplicate posts and other possible maladies. If you're upgrading an existing installation, updates and FeedWordPress template functions *will not work* until you've done the upgrade.

3.  Take a coffee break while the upgrade runs. It should, hopefully, finish within a few minutes even on relatively large databases.

4.  `update-feeds.php` has been overhauled to improve performance and ease of use, and also to make errors easier to detect and eliminate. The overhaul doesn't require any changes to your setup *if* you used XML-RPC pings, or command-line PHP, to do scheduled updates. It *does* affect you if you used curl or some other tool to send HTTP requests to `update-feeds.php`: your old cron job will probably not work anymore. See [Setting Up Feed Updates][] below to get scheduled updates back on track.

5.  Enjoy your new installation of FeedWordPress.

#### New Installations ####

1.  Install `feedwordpress.php` in your WordPress `plugins` directory and `update-feeds.php` in your WordPress `wp-content` directory.

2.  (Optional) Upgrade the copy of MagpieRSS packaged with WordPress by installing the new `rss-functions.php` (archived in `OPTIONAL/wp-includes`) into your WordPress `wp-includes` directory. Upgrading MagpieRSS is necessary if you want to take advantage of support for Atom 1.0, multiple post categories, RSS enclosures, and multiple character encodings.
    (Note, however, that support for transliterating between character encodings is a very complex and iffy prospect in some PHP environments, so if you intend to use a lot of feeds with alternate encodings you should make sure that your installation of PHP is up-to-date and that you keep a copy of the old MagpieRSS around to compare results.)

3.  Log in to the WordPress Dashboard and activate the FeedWordPress plugin.

4.  While you're at the Dashboard, once the plugin is activated, you can go to Options --> Syndication and set (1) the link category that FeedWordPress will syndicate links from (by default, "Contributors"), and (2) a "secret word" for your XML-RPC updating interface. This provides some light security by keeping passing ruffians from saying "Update all the feeds" at will to your FeedWordPress installation.

5.  Go to Links --> Syndicated to set up the list of sites that you want FeedWordPress to syndicate onto your blog. (If you have the feeds you want to aggregate in a service such as Bloglines, you may prefer to export them to an OPML file and use WordPress's Links --> Import to import them into the contributors category.)

#### Setting Up Feed Updates ####

FeedWordPress is now ready to accept posts from its syndication sources. Unfortunately, it doesn't yet know *when* to go get them. (**This may be true even if you are upgrading an existing installation of FeedWordPress:** your old cron job will still work if you used command-line PHP or blogging software pings to do updates, but it will need to be fixed if you used curl or another tool to send HTTP requests to `update-feeds.php`.)

You can load in syndicated posts for the first time by pointing your web browser to `update-feeds.php`. If you have WordPress installed at, say, then you should point your browser to and log in as any user in the user database. (You may want to create a new "dummy" user for doing scheduled updates, using **Users --> Authors & Users --> Add New User**.)
Tell FeedWordPress to update all feeds, and you'll get the first wave of posts imported into the database.

Congratulations! You should now have an aggregator site full of delicious syndicated content hot off of the newswires. Now you just need a way to *keep* the content freshly updated. Unless you enjoy manually browsing to `update-feeds.php` every hour on the hour, you'll probably want to do this by setting up your site for automated updates. You can pull that off in one of two ways, or by a mixture of both:

1.  **The Blogging Software Ping Method:** You can get all of your contributors to add you to the list of URIs that they notify of updates: while FeedWordPress is activated, it will accept XML-RPC "recently updated" pings in the standard format accepted by Weblogs.com, Ping-O-Matic, Technorati, and other services. Most blogging software allows users to add a URI to the list of URIs that get pinged on each update. (See, for example, Options --> Writing --> Update Services in WordPress, or Configuration --> Preferences --> Publicity / Remote Interfaces / TrackBack in Movable Type.)

    If you can get a contributor to add your XML-RPC URI to her services-to-ping list (if you have WordPress installed at , say, the URI to add should be ), then whenever she updates her blog, her blogging software will ping your FeedWordPress installation, and FeedWordPress will look up her feed to grab the new posts off of it.

2.  **The Scheduled Update Job Method:** You may very well not be able to get all your contributors to add your site to their blogging software's ping list, and even if you do you may want to have a back-up option to catch updates later even if the ping fails to go through on one particular occasion. You'll need to create a scheduled job to periodically check for updates on *all* the feeds. You'll need either (a) the ability to create cron jobs on your web host or (b) access to another computer with a reliable, always-on Internet connection.
    If you *can* create a crontab on your web host, then the best thing to do is to create a cron job that will run `update-feeds.php` through the PHP command-line interface. For example, if you have WordPress installed in `~/www/wp` (where `~` is your home directory), you might insert the following line into your crontab:

        25 * * * * cd $HOME/www/wp/wp-content ; php -q update-feeds.php

    If you *don't* have access to (a), you can still save the day using another computer with always-on Internet access that sends a POST request to the `update-feeds.php` URI on a regular schedule. So, for example, if you have WordPress installed at , and you have a dummy user in your WordPress database with the login name 'login' and the password 'pass', then you could add the following line to the crontab on a home Linux box:

        25 * * * * curl --user login:pass http://www.zyx.com/blog/wp-content/update-feeds.php -d update=quiet

    The `-d update=quiet` switch does two things: (1) it ensures that `update-feeds.php` will receive an HTTP POST request rather than an HTTP GET request (which is important, since the script won't take any actions with side-effects -- such as checking for new posts -- unless it receives an HTTP POST); and (2) it tells the script to suppress the HTML output that it would generate for normal web browsers, and to output text only if it encounters errors (this will keep the number of e-mails you receive from the Cron Daemon to a minimum).

    If you are using Windows XP and have a version of curl (such as the version included in [Cygwin][]), you can create a Scheduled Task to similar effect.

[Cygwin]: http://www.cygwin.com/

Basic Concepts
--------------

FeedWordPress is written as a plugin for [WordPress 1.5][]. It is designed to store all the data it needs within the WordPress database and to make that data easy to manage from within the WordPress Dashboard.

### Contributors / Newsfeeds ###

FeedWordPress uses the WordPress Links database to keep a list of the feeds from which it will syndicate content.
WordPress allows you to place links in categories; FeedWordPress will make use of all and only the links in one category (by default, this is a category named "Contributors"; you can change the category that FeedWordPress will use using **Options --> Syndication**). From WordPress's perspective, the Contributors are normal links, and they can be manipulated like other links through the WordPress Dashboard. However, FeedWordPress provides a nicer interface for adding, removing, or changing information for the Contributor Links from the WordPress Dashboard, under **Links --> Syndicated**.

If you want to distribute the labor of adding, updating, and managing feeds between several people, you can use the WordPress login and access privileges system. Any users with an access level of 5 or greater can add, delete, and modify Contributors; users with an access level of 6 or greater can change syndication options.

When FeedWordPress looks for new posts, it retrieves one or all of the links from the Contributors category (depending on whether it has been told to scan for new posts on one or all of the feeds), determines which of them should be polled for updates (based on how long it has been since the last time each feed was polled for updates), and then uses an HTTP conditional GET to check for updates at the "RSS URI" for each Link that it selects. Any new posts are added to the database, and old posts that have been updated since the last poll are updated to reflect the new version.

__Feed settings:__ All of the information for a syndicated feed is stored in the WordPress Links database, and can be easily edited using an interface that FeedWordPress provides under **Links --> Syndicated**. (If you're curious about the technical details of how the information is stored, you can find out more under [API: How feed information is stored][].)

You can use a feed's **Edit** link under **Links --> Syndicated** to affect how FeedWordPress processes posts from that feed.
(Most of these options can either be set for *one particular feed* using **Links --> Syndicated --> Edit**, or set as the default for *all feeds* using **Options --> Syndication**.)

The **Edit** link also allows you to set **Custom Feed Settings** for use in templates, through the use of the [`get_feed_meta()`][get_feed_meta] template function in a post context (see [Template API][]). For example, many aggregator sites use a "face" image for each feed to visually distinguish posts from different feeds. To implement a face feature, you could add a custom setting for each Contributor Link, with the key of "face" and a URI such as "http://www.zyx.com/mugs/ugly" for the value. (The URI should be changed out for each feed to point to the appropriate image, of course.) Then, to use the setting from within a template, add something like:

    <?php
    // In a post context
    $face = get_feed_meta('face');
    if (strlen($face) > 0): ?>
    <img src="<?php echo $face; ?>" alt="" />
    <?php endif; ?>

... which will display the image, if any, whose URI is set in the "face" setting for the feed that post comes from. If there is no "face" setting for a particular feed, [`get_feed_meta()`][get_feed_meta] will return an empty string and no image will be displayed.

[API: How feed information is stored]: http://projects.radgeek.com/feedwordpress/api#how-feed-information-is-stored

### Syndicated Posts ###

Whenever FeedWordPress updates, it scans one or more of the feeds in its Contributors list and adds any new posts that it finds to the WordPress database. Syndicated posts are displayed on your WordPress pages like any other posts: they can be listed in archives by category, author, or date; they can be found with the search box; and they are included in the newsfeed of your blog.
In your WordPress templates (**Presentation --> Theme Editor**) you can access special information about syndicated posts using [functions provided by FeedWordPress][Template API], such as [`is_syndicated()`][is_syndicated], [`the_syndication_source()`][the_syndication_source], [`the_syndication_source_link()`][the_syndication_source_link], and [`get_feed_meta()`][get_feed_meta]. For example, here is the template code that I use (in a post context) to display both the author's name and the original source of the post in the templates for [Feminist Blogs][]:

    <?php
    the_author();
    if (is_syndicated()):
        echo ' from <a href="';
        the_syndication_source_link();
        echo '">';
        the_syndication_source();
        echo '</a>';
    endif; ?>

For more information on template functions, see [Template API][].

### Categories ###

WordPress allows for posts to be placed in *categories*. Each syndicated post that FeedWordPress adds to the WordPress database is placed into a set of categories. FeedWordPress gets the list of category names to use from two sources:

1.  Categories (or "tags") that the original author placed the post in on her blog

2.  Categories that you set explicitly for each feed using the **Categories** checkbox under **Links --> Syndicated --> Edit**. For example, if you wanted all the posts from Alas, A Blog to be placed in the "Pacific Northwest" category and the "Cartoonists" category (*in addition to* any other categories that they were placed in on Alas, A Blog), you could do this by creating the categories, going to **Links --> Syndicated**, clicking the "Edit" link for Alas, A Blog, and checking those two categories under the checkbox captioned "Categories."

Given the list of category names, FeedWordPress looks for categories in the WordPress database with the same name as either (1) the category name, or (2) one of the "aliases" listed in the category description.
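
The lookup described above can be sketched in plain PHP. This is a minimal illustration, not FeedWordPress's actual code: the function name `sketch_match_category()`, the layout of the `$categories` array, and the case-insensitive comparison are all assumptions made for the example, and each category's aliases are taken as already extracted from its description.

```php
<?php
// Hypothetical sketch, not FeedWordPress's own implementation.
// $categories maps a category ID to its name and its list of aliases.
function sketch_match_category($feedName, $categories) {
    // Assumption: names are compared case-insensitively, ignoring outer spaces.
    $needle = strtolower(trim($feedName));
    foreach ($categories as $id => $cat) {
        // (1) the category's own name
        if (strtolower(trim($cat['name'])) === $needle) {
            return $id;
        }
        // (2) any alias listed for the category
        foreach ($cat['aliases'] as $alias) {
            if (strtolower(trim($alias)) === $needle) {
                return $id;
            }
        }
    }
    return null; // no category has this name or alias
}

// Example data (invented for illustration).
$categories = [
    3 => ['name' => 'Feminism', 'aliases' => ['feministy stuff', "Women's Issues"]],
    7 => ['name' => 'Cartoonists', 'aliases' => []],
];

$match = sketch_match_category('feministy stuff', $categories);
echo ($match === null) ? "no match\n" : "category ID $match\n";
```

A `null` result stands for a category name that matches nothing in the database; what happens to a post in that case depends on the settings described in the rest of this section.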

__Aliases:__ Different authors often use slightly different names for categories that mean the same thing (contributors to Feminist Blogs, for example, used categories including "Feminism", "feministy stuff", "Women's Issues", "Gender Issues", "Gender Equality", and so on). If you want FeedWordPress to treat one category name as a synonym for another, you can do so by creating an "alias" for the category. For example, to make FeedWordPress treat posts that are placed in the category "feministy stuff" as if they had been placed in the category "Feminism", go to **Manage --> Categories**, find the category "Feminism" and click the "Edit" link for it, and then add the following to the Description field, on a line by itself:

    a.k.a.: feministy stuff

You can add as many aliases as you like. You can also add any other text that you like to the Description without interfering with FeedWordPress's ability to use the aliases. Each alias must be on a line by itself.

__Unfamiliar categories:__ If one of the category names that a newsfeed provides is *unfamiliar* -- that is, if there is not yet any category in your WordPress database that either has that name, or uses that name as an alias -- then by default FeedWordPress will *automatically create* a new category with that name and place the current post in it.

The default behavior can be changed so that unfamiliar categories will *not* be added to the database, using the **Unfamiliar categories** setting, either for *all* feeds (under **Options --> Syndication**) or for *one particular feed* (under **Links --> Syndicated**).

If you choose to disable the creation of new categories, either for all feeds or for one particular feed, then you can also choose whether or not FeedWordPress should syndicate posts that do not match *any* of the categories that are currently in the database.
This allows you to do some simple filtering of posts by category: if you want your blog to syndicate only the posts in one particular category from a feed that has several categories, you could do so by creating a category by that name, adding the new feed(s), and then setting **Unfamiliar categories** under **Links --> Syndicated --> Edit** to "don't create new categories and don't syndicate posts unless they match at least one familiar category". Since only posts in categories that are in your database will be included, and only the category or categories that you wanted posts from have been added to your database, this will filter out all the posts that aren't in the category or categories that you defined ahead of time. (Similarly, you could set up FeedWordPress so that *all* the feeds are filtered by category by creating the set of categories that you want to syndicate, and then setting the default behavior for *all* feeds at **Options --> Syndication**.)

If you need a category filter with more complex logic, you can always create a `syndicated_item` filter in PHP (see [Plugin API][]) that manipulates the `['categories']` array of a syndicated item.

### Authors ###

Most newsfeeds include information about the author of the items on them. (If a feed doesn't, then FeedWordPress will create an author's name based on the title of the feed from which the item was taken.) This information is used to determine the WordPress user that the post will be attributed to.

Given the name of the author, FeedWordPress looks for authors in the WordPress database with the same name as either (1) their login, (2) their first name, (3) their nickname, (4) their full name, or (5) one of the "aliases" listed in the user's profile.
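
The five-way lookup can be sketched in plain PHP as follows. This is a hedged illustration rather than FeedWordPress's real code: the function names, the `$users` array layout, and the case-insensitive comparison are assumptions, and the sketch assumes author aliases are written in the profile text as `a.k.a.:` lines, the same convention used for category descriptions.

```php
<?php
// Hypothetical sketch, not FeedWordPress's own implementation.

// Collect aliases from free-form profile text: each line of the form
// "a.k.a.: Some Name" contributes one alias; all other lines are ignored.
function sketch_profile_aliases($profile) {
    $aliases = [];
    foreach (preg_split('/\r\n|\r|\n/', $profile) as $line) {
        if (preg_match('/^\s*a\.k\.a\.:\s*(.+?)\s*$/i', $line, $m)) {
            $aliases[] = $m[1];
        }
    }
    return $aliases;
}

// Return the login of the first user whose login, first name, nickname,
// full name, or alias equals $authorName, or null when nobody matches.
function sketch_match_author($authorName, $users) {
    // Assumption: names are compared case-insensitively.
    $needle = strtolower(trim($authorName));
    foreach ($users as $user) {
        $candidates = array_merge(
            [$user['login'], $user['first_name'], $user['nickname'], $user['full_name']],
            sketch_profile_aliases($user['profile'])
        );
        foreach ($candidates as $candidate) {
            if (strtolower(trim($candidate)) === $needle) {
                return $user['login'];
            }
        }
    }
    return null;
}

// Example data (invented for illustration).
$users = [[
    'login'      => 'benedict',
    'first_name' => 'Benedict',
    'nickname'   => 'B16',
    'full_name'  => 'Pope Benedict XVI',
    'profile'    => "Bishop of Rome.\na.k.a.: Joseph Cardinal Ratzinger",
]];

$login = sketch_match_author('Joseph Cardinal Ratzinger', $users);
echo ($login === null) ? "no match\n" : "matched user $login\n";
```

Keeping the alias parsing in its own function means that other text in the profile is simply skipped, which is consistent with the statement below that extra profile text does not interfere with the aliases.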

__Aliases:__ If there is an author who posts under more than one name (for example, one of our contributors at [Feminist Blogs][] posts on several different blogs, sometimes using her full name and sometimes using only her first name), then you can ensure that FeedWordPress will attribute those posts to the same author by creating "aliases" for the author. For example, to make FeedWordPress treat posts by "Joseph Cardinal Ratzinger" and posts by "Pope Benedict XVI" as having the same author, go to **Users --> Authors & Users**, click on the "Edit" link for Pope Benedict XVI, and add a line like this to the Profile text:

    a.k.a.: Joseph Cardinal Ratzinger

You can add as many aliases as you like. You can also add any other text that you like to the Profile without interfering with FeedWordPress's ability to use the aliases. Each alias must be on a line by itself.

__Unfamiliar authors:__ If the author named by the newsfeed is unfamiliar -- that is, if there is no one with that name registered in the WordPress authors database -- then by default FeedWordPress will automatically create a new user account with the given name and attribute the post to the new user.

The default behavior can be changed, using either the global settings in **Options --> Syndication** or the [feed settings][] under **Links --> Syndicated --> Edit**, so that posts by unfamiliar authors will either be attributed to a default author (instead of creating a new user account to attribute them to), or filtered out and not syndicated at all.

One of the uses of this feature is filtering posts by author: if you want your blog to syndicate only the posts by one particular author from a feed that has several authors, you could do so by creating a user account with that author's name, adding the new feed(s), and then setting **Unfamiliar authors** under **Links --> Syndicated --> Edit** to "don't syndicate the post".
Since only posts by authors that are in your database will be included, and only the author that you wanted posts from has been added to your database, this will filter out posts by anyone else on the feeds with that setting. (Similarly, you could set up FeedWordPress so that *all* the feeds are filtered by author by creating the set of users named after the authors you want to syndicate, and then setting the default behavior for *all* feeds at **Options --> Syndication**.)

If you need an author filter with more complex logic than this allows, you can always create a `syndicated_item` filter in PHP (see [Plugin API][]) that manipulates the `['author_name']` or `['dc']['creator']` elements of a syndicated item.

Template API
------------

When activated, FeedWordPress makes the following functions available for use by themes/templates:

* ``is_syndicated()``: in a post context, returns ``TRUE`` if the post was syndicated from another website, or ``FALSE`` if it was originally posted here
* ``get_syndication_permalink()``: in a post context, returns the URI of the permalink for this post *on the website it was syndicated from*
* ``the_syndication_permalink()``: in a post context, outputs the value returned by [``get_syndication_permalink()``][get_syndication_permalink]
* ``get_syndication_source_link()``: in a post context, returns the URI of the front page (*not* the feed) of the website this post was syndicated from
* ``the_syndication_source_link()``: in a post context, outputs the URI returned by [``get_syndication_source_link()``][get_syndication_source_link]
* ``get_syndication_source()``: in a post context, returns the human-readable title of the website that a syndicated post was syndicated from
* ``the_syndication_source()``: in a post context, outputs the value returned by [``get_syndication_source()``][get_syndication_source]
* ``get_syndication_feed()``: in a post context, returns the URI of the feed (*not* the front page) that this post was syndicated from
* ``the_syndication_feed()``: in a post context, outputs the value returned by [``get_syndication_feed()``][get_syndication_feed]
* ``get_feed_meta($key)``: in a post context, returns the value, if any, of the feed setting ``$key`` for the feed that this post was syndicated from

By default, FeedWordPress also places a filter on the standard functions ``get_permalink()`` and ``the_permalink()`` that substitutes the URI returned by [``get_syndication_permalink()``][get_syndication_permalink] for the URI generated by WordPress. This means that by default the permalinks listed on your website and in your newsfeed will link to the location of the posts on the source website, *not* to their location on your website. You can switch this behavior on or off at **Options --> Syndication** in the WordPress Dashboard.

Plugin API
----------

FeedWordPress creates seven hooks through the WordPress plugin architecture that you can plug in to using PHP WordPress plugins, to supplement ordinary FeedWordPress behavior, or to filter posts according to criteria that you set. The hooks are the action ``feedwordpress_update``, the action ``feedwordpress_check_feed``, the action ``feedwordpress_update_complete``, the filter ``syndicated_item``, the filter ``syndicated_post``, the action ``post_syndicated_item``, and the action ``update_syndicated_item``.

For more information, see .

License
-------

The FeedWordPress plugin is copyright (c) 2005 by Charles Johnson. It uses code derived or translated from:

- [wp-rss-aggregate.php][] by [Kellan Elliot-McCrea](kellan@protest.net)
- [MagpieRSS][] by [Kellan Elliot-McCrea](kellan@protest.net)
- [HTTP Navigator 2][] by [Keyvan Minoukadeh](keyvan@k1m.com)
- [Ultra-Liberal Feed Finder][] by [Mark Pilgrim](mark@diveintomark.org)

according to the terms of the [GNU General Public License][].

This program is free software; you can redistribute it and/or modify it under the terms of the [GNU General Public License][] as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

[wp-rss-aggregate.php]: http://laughingmeme.org/archives/002203.html
[MagpieRSS]: http://magpierss.sourceforge.net/
[HTTP Navigator 2]: http://www.keyvan.net/2004/11/16/http-navigator/
[Ultra-Liberal Feed Finder]: http://diveintomark.org/projects/feed_finder/
[GNU General Public License]: http://www.gnu.org/copyleft/gpl.html